Transcript

1 INTRODUCTION

The term Video Graphics Array (VGA) refers specifically to the display
hardware first introduced with the IBM PS/2 line of computers. VGA is
referred to as an "array" instead of an "adapter" because it was implemented
from the start as a single chip, replacing dozens of discrete logic chips.

The VGA Controller provides a simple interface between a host processor
and any VGA-compatible monitor.

Taking a processor-generated (pixelated) picture from memory space, the
Controller provides digital RGB values for each pixel, as well as horizontal
and vertical synchronization signals, in order to correctly display the picture
on a connected monitor.


2 HISTORY

The term Video Graphics Array (VGA) refers specifically to the display
hardware first introduced with the IBM PS/2 line of computers in 1987, but
through its widespread adoption it has also come to mean either an analog
computer display standard, the 15-pin D-subminiature VGA connector, or the
640×480 resolution itself. While this resolution has been superseded in the
personal computer market, it is becoming a popular resolution on mobile
devices.

Video Graphics Array (VGA) was the last graphical standard introduced by
IBM that the majority of PC clone manufacturers conformed to, making it
today (as of 2009) the lowest common denominator that all PC graphics
hardware supports before a device-specific driver is loaded into the computer.
For example, the MS-Windows splash screen appears while the machine is still
operating in VGA mode, which is the reason that this screen always appears in
reduced resolution and color depth.

VGA was officially superseded by IBM's XGA standard, but in reality it was
superseded by numerous slightly different extensions to VGA made by clone
manufacturers, which came to be known collectively as "Super VGA".

2.1 VGA compared to other standard resolutions

VGA is referred to as an "array" instead of an "adapter" because it was
implemented from the start as a single chip (an ASIC), replacing the Motorola
6845 and dozens of discrete logic chips that covered the full-length ISA boards
of the MDA, CGA, and EGA. Its single-chip implementation also allowed the
VGA to be placed directly on a PC's motherboard with a minimum of
difficulty (it only required video memory, timing crystals and an external
RAMDAC), and the first IBM PS/2 models were equipped with VGA on the
motherboard.


2.2 The VGA specifications are as follows:

• 256 KB Video RAM (the very first cards could be ordered with 64 KB or
  128 KB of RAM, at the cost of losing some video modes)
• 16-color and 256-color modes
• 262,144-value color palette (six bits each for red, green, and blue)
• Selectable 25.175 MHz or 28.322 MHz master clock
• Maximum of 800 horizontal pixels
• Maximum of 600 lines
• Refresh rates at up to 70 Hz
• Vertical blank interrupt
• Packed-pixel mode: 256 colors
• Hardware smooth scrolling support

The VGA supports both All Points Addressable graphics modes and
alphanumeric text modes. Standard graphics modes are:

• 640×480 in 16 colors
• 640×350 in 16 colors
• 320×200 in 16 colors
• 320×200 in 256 colors


3 DESCRIPTION ARCHITECTURE

3.1 VGA display device

The VGA display device is used for displaying the images taken from the
processor in an exact format. It can be used for displaying the output of any
project; analog graphs, etc., can be displayed directly on the VGA display
monitor.

3.2 VGA (Video Graphics Array) Controller

The VGA controller controls the image so that it is displayed exactly on the
display device. It acts as an interface between the processing system and the
display device, so it is called an interface controller.

[Figure: block diagram. Image or video processing system -> VGA Controller -> Display]

4 CONTROLLER DESIGN

To design the controller, we should first know how the display device works.
Consider the image below.

[Figure: original 640x480 image, and the pixels of the image]

The display device also has pixels. The display resolution 640x480 is a
standard display resolution.

Pixel


This example shows an image with a portion greatly enlarged, in which the
individual pixels are rendered as little squares and can easily be seen.


4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

In digital imaging, a pixel (or picture element [1]) is a single point in a raster
image. The pixel is the smallest addressable screen element; it is the smallest
unit of picture which can be controlled. Each pixel has its own address. The
address of a pixel corresponds to its coordinate. Pixels are normally arranged
in a 2-dimensional grid, and are often represented using dots or squares. Each
pixel is a sample of an original image, where more samples typically provide
more-accurate representations of the original. The intensity of each pixel is
variable. In color image systems, a color is typically represented by three or
four component intensities, such as red, green, and blue, or cyan, magenta,
yellow, and black.

Color refreshing rate

Color refreshing rate = 60 Hz

The color refreshing rate represents the number of frames that are transmitted
per second. The minimum number of frames that the VGA controller must
transmit per second is 60.


5 Hardware Description

5.1 Block Diagram

5.1 VGA Synchronization Unit

The Synchronization Unit provides the horizontal and vertical
synchronization signals, HSYNC and VSYNC, that are required to correctly
display a picture frame within the confines of a monitor's display area.

These synchronization signals are used as control inputs by the
monitor's horizontal and vertical deflection circuits. These circuits deflect the
electrons emitted by the three primary color electron guns (Red, Green, Blue)
left to right and from top to bottom, respectively. HSYNC provides the start
and stop times for the horizontal deflection circuit, so that a line of pixels is
correctly drawn across the screen display. VSYNC provides the start and stop
times for the vertical deflection circuit, so that the lines of a frame are correctly
drawn from the top to the bottom of the screen display.

The resolution for the display is defined by the level on the
RESOLUTION input. If High, the 640x480 resolution is used (VGA). If Low,
the 800x600 resolution (SVGA) is used.

Although the resolution determines the area of a monitor's screen
within which an image can be displayed, the full extents of the chosen
resolution do not have to be used. The actual extents of the image display area
on the screen can be controlled by the use of the DISPSIZE_H and
DISPSIZE_V inputs. These inputs determine the total number of pixels to be
used in a line and the total number of lines in a frame, respectively.

5.2 Blank pixel generation

The total number of viewable pixels in a line and viewable lines in a
frame is determined by the display resolution chosen through the
RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
chosen display resolution is used or not, the areas of the monitor screen to the
top, bottom, left and right of the viewable frame area are blanked by putting
black pixels at the required line-pixel positions. This has the effect of centering
the image on the screen.


The color generated for a pixel in the Pixel Unit depends on whether
the particular pixel needs to be blanked or not. The Synchronization Unit
provides a signal to the Pixel Unit for this very reason. This is the line display
enable signal, en. The signal is checked on each rising edge of the external
clock signal (CLK) and is set as follows:

If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
    en = 0 (pixel requires to be blanked; set color to be black)
Else
    en = 1 (pixel is a viewable pixel; generate RGB color accordingly)
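The blanking rule above can be sketched as a small Python model. This is only an illustration of the comparison logic, not the Controller's VHDL. The function names `display_enable` and `pixel_color` are hypothetical, and 640 and 480 are used as example values for PixTotal and LinTotal.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Model of the line display enable signal `en`.

    Returns 0 when the pixel lies outside the viewable area (it must be
    blanked) and 1 when it is a viewable pixel.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


def pixel_color(en: int, rgb: int) -> int:
    """Return the 6-bit RGB code to display: black (000000) when blanked."""
    return rgb if en else 0b000000


# Example with PixTotal = 640 and LinTotal = 480 (full VGA extents).
inside = display_enable(639, 479, 640, 480)   # last viewable pixel
outside = display_enable(640, 0, 640, 480)    # first blanked column
```

A real design evaluates this comparison on every rising edge of CLK. The model shows only the combinational decision.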

6 SIGNALS IN VGA CONTROLLER

For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal
timings are:

Parameter                   Value      Unit
Clock frequency             25.175     MHz
Horizontal pixels           640
Horizontal sync polarity    Negative
Total time for each line    31.77      µs
Front porch (A)             0.94       µs
Sync pulse length (B)       3.77       µs
Back porch (C)              1.89       µs
Active video (D)            25.17      µs

(Total horizontal sync time: 6.60 µs)


6.1 The vertical timings are:

Parameter                   Value      Unit
Vertical lines              480
Vertical sync polarity      Negative
Vertical frequency          59.94      Hz
Front porch (E)             0.35       ms
Sync pulse length (F)       0.06       ms
Back porch (G)              1.02       ms
Active video (H)            15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 at 70 Hz is the video mode used for booting most x86 personal
computers.

640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for
640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
25.174 MHz gives 633 pixels rather than the expected 640 pixels.
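The arithmetic behind these timing figures can be checked with a short Python sketch. It only re-adds the tabulated values; the dictionary and variable names are illustrative and not taken from the design.

```python
# Horizontal timings in microseconds and vertical timings in milliseconds,
# as listed in the timing tables for 640x480 at 60 Hz.
H_US = {"front_porch": 0.94, "sync": 3.77, "back_porch": 1.89, "active": 25.17}
V_MS = {"front_porch": 0.35, "sync": 0.06, "back_porch": 1.02, "active": 15.25}

line_us = sum(H_US.values())       # total time for each line
frame_ms = sum(V_MS.values())      # total time for each frame
refresh_hz = 1000.0 / frame_ms     # frames per second

# Blanking (non-active) portions of a line and of a frame.
h_blank_us = line_us - H_US["active"]
v_blank_ms = frame_ms - V_MS["active"]

# Whole pixels that fit in the active video time at 25.174 MHz:
# microseconds multiplied by MHz gives a count of clock periods.
active_pixels = int(H_US["active"] * 25.174)
```

The sums reproduce the 31.77 µs line time and a frame time of 16.68 ms, which is about 59.95 frames per second and close to the tabulated 59.94 Hz. The active time holds 633 whole pixels, as the text notes.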

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal
(RST) is received by the VGA Controller. The signal is updated on each rising
edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal
counter, and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)
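A minimal Python sketch of the HSYNC window comparison follows. The names PixTotal, BlankingLeft and TLSync come from the text, but the numeric values used here (16 and 96 clocks, with 800 clocks per line) are assumptions chosen for illustration and are not stated in the source.

```python
def hsync_level(hcount: int, pix_total: int, blanking_left: int, tl_sync: int) -> int:
    """Return the HSYNC level (active low) for a horizontal counter value."""
    start = pix_total + blanking_left
    # Low inside the sync window [start, start + tl_sync), High elsewhere.
    return 0 if start <= hcount < start + tl_sync else 1


# Assumed example values (not from the source document).
PIX_TOTAL = 640
BLANKING_LEFT = 16
TL_SYNC = 96
LINE_PERIOD = 800

levels = [hsync_level(h, PIX_TOTAL, BLANKING_LEFT, TL_SYNC) for h in range(LINE_PERIOD)]
low_clocks = levels.count(0)   # number of clocks HSYNC stays low in one line
```

With these example values, HSYNC goes low at count 656 and returns high at count 752, so the pulse lasts exactly TLSync clocks.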

Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal
(RST) is received by the VGA Controller. The signal is updated after every
line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical
counter, and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)

Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive
pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external
reset signal (RST). The size of the counter depends on the values chosen for
DISPSIZE_H and DISPSIZE_V, as the range is simply:

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each
of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
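The 19-bit figure follows directly from the largest value the counter must hold. A short Python check is given below; the helper name `counter_bits` is hypothetical.

```python
def counter_bits(pix_total: int, lin_total: int) -> int:
    """Number of bits needed to count from 0 to (pix_total * lin_total) - 1."""
    return (pix_total * lin_total - 1).bit_length()


max_vga = 640 * 480 - 1     # highest address for 640x480
max_svga = 800 * 600 - 1    # highest address for 800x600
bits_vga = counter_bits(640, 480)
bits_svga = counter_bits(800, 600)
```

Both resolutions need 19 bits, because 2^18 = 262144 is smaller than either maximum and 2^19 = 524288 is larger than both.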

While the value in the horizontal counter (HCOUNT) is less than the
total number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When
the counter reaches the last pixel in a line, its incremented value is the first
pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
point, it will be rolled over to zero.
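The counting behaviour described above can be modelled in a few lines of Python. This is a behavioural sketch under simplifying assumptions: one loop iteration stands for one CLK edge, and the tiny frame used in the example (4x3 viewable pixels inside a 6x5 total frame) is invented purely to keep the trace short.

```python
def frame_addresses(pix_total, lin_total, line_period, frame_period):
    """Return a list of (hcount, vcount, memaddr) tuples, one per clock."""
    memaddr = 0
    trace = []
    for vcount in range(frame_period):
        for hcount in range(line_period):
            if vcount >= lin_total:
                memaddr = 0            # past the last viewable line: roll over
            trace.append((hcount, vcount, memaddr))
            if vcount < lin_total and hcount < pix_total:
                memaddr += 1           # advance only across viewable pixels
    return trace


trace = frame_addresses(pix_total=4, lin_total=3, line_period=6, frame_period=5)
# Addresses presented while a viewable pixel is being drawn.
visible = [addr for (h, v, addr) in trace if h < 4 and v < 3]
```

The viewable addresses run consecutively from 0 to 11, crossing line boundaries without gaps, and the counter is back at zero by the end of the frame.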


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the
Controller provides six modes of display, depending on the resolution chosen
(640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns (taken as roughly 50 ns)
   Max time for 1 pixel = 40 ns

2) To send the information for each and every pixel of a frame by means of
the RGB signals.

3) After one row of information has been completely displayed, the column
must be reset and the row incremented; this is done by the Hsync signal.

4) After one frame of information has been completed, a new frame has to be
started; for this, both row and column are reset, which is done by the Vsync
signal.


=> To give the information for 1 pixel we use 1 clock cycle (for controlling
all operations):

1 clock = 1 pixel
1 clock cycle period = 40 ns
Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display 1 pixel of information on each and
every clock, we can attain a minimum 60 Hz refresh rate. Do not use a clock
frequency of less than 25 MHz.

Ex: compared with the effectively utilized frequency of 25 MHz:

• 15 MHz: the time period is more, so the number of frames/sec is less
  (decreases). It is unable to display 60 frames/sec.
• 30 MHz: the time period is less, so the number of frames/sec is more
  (increases).
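The clock-rate reasoning can be made concrete with a short Python calculation. It assumes a total frame of 800 clocks per line and 525 lines, which is consistent with the 31.77 µs line time and 16.68 ms frame time in section 6 but is not stated explicitly in this section. The names are illustrative only.

```python
VISIBLE_PIXELS = 640 * 480              # 307200 viewable pixels per frame

# Time available per viewable pixel if a frame took exactly 1/60 s
# and there were no blanking intervals.
budget_ns = 1e9 / (60 * VISIBLE_PIXELS)

clock_period_ns = 40.0
clock_mhz = 1e3 / clock_period_ns       # 40 ns period corresponds to 25 MHz

# Assumption: total clocks per frame including blanking (800 x 525).
TOTAL_CLOCKS_PER_FRAME = 800 * 525


def refresh_hz(clock_hz: float) -> float:
    """Frames per second when one pixel position is clocked out per cycle."""
    return clock_hz / TOTAL_CLOCKS_PER_FRAME


rates = {mhz: refresh_hz(mhz * 1e6) for mhz in (15, 25, 30)}
```

Under this assumption a 25 MHz clock gives about 59.5 frames per second, 15 MHz falls to about 36, and 30 MHz rises to about 71. This matches the qualitative comparison in the text.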

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (Horizontal Synchronous signal)
• VSYNC (Vertical Synchronous signal)

All of the above signals are used to control the image data so that it is
correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse, and back porch

               Horizontal sync    Vertical sync
Front porch    0.94 µs            0.35 ms
Sync pulse     3.77 µs            0.063 ms
Back porch     1.89 µs            1.02 ms


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

[Figure: VGA Controller symbol]

Pin description

The pin description is shown in the table below.

Table: VGA pin description

Each entry gives: Name (Type, Polarity, Bus size), followed by the description.

Global Control Signals

CLK (Type: I, Polarity: Rising)
    Global system clock. This clock determines the maximum rate at which
    pixels can be sent to the monitor. The frequency of the clock signal
    determines the refresh rate as follows:
    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST (Type: I, Polarity: High)
    Global system reset.

VGA Input Settings Signals

RESOLUTION (Type: I, Polarity: High/Low)
    This input selects the screen resolution to be used:
    1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD (Type: I, Bus size: 2)
    This input selects the color palette to be used:
    00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H (Type: I, Bus size: 10)
    This input determines the number of viewable pixels to be displayed in
    each line of a frame, and is therefore used to control the horizontal
    extents of the visible display area.

DISPSIZE_V (Type: I, Bus size: 10)
    This input determines the number of lines to be displayed in a frame, and
    is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD (Type: O, Polarity: High)
    This is the enable signal when data is required to be read from the memory
    space. This signal is controlled by, and follows, the internal line enable
    signal, en, generated by the Synchronization Unit of the Controller.

DATA (Type: I, Bus size: 8)
    Data input from picture memory space. Data is stored in memory in bytes,
    the content of which depends on the chosen color palette, as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL (Type: O, Bus size: 19)
    Specifies the address of the next pixel in picture memory. Addresses are
    consecutive: once the end of the current line has been reached, the next
    address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC (Type: O, Polarity: Falling)
    Horizontal synchronization signal. This signal is used to control the
    horizontal deflection circuit in the VGA monitor, so that the start and end
    of a line of pixels is correctly displayed across the visible display area of
    the screen. The horizontal size of the display area is controlled by the
    DISPSIZE_H input to the Controller.

VSYNC (Type: O, Polarity: Falling)
    Vertical synchronization signal. This signal is used to control the vertical
    deflection circuit in the VGA monitor, so that the start and end of a frame
    (of lines) is correctly displayed between the top and bottom edges of the
    visible display area of the screen. The vertical size of the display area is
    controlled by the DISPSIZE_V input to the Controller.

R1, R0 (Type: O, Polarity: High/Low)
    Provides the 2-bit digital signal for the intensity of red used in composing
    a pixel's displayed color. These two signals are inputs to a simple 2-bit
    DAC (external to the Controller) that provides the analog signal required
    by the VGA monitor.

G1, G0 (Type: O, Polarity: High/Low)
    Provides the 2-bit digital signal for the intensity of green used in
    composing a pixel's displayed color. These two signals are inputs to a
    simple 2-bit DAC (external to the Controller) that provides the analog
    signal required by the VGA monitor.

B1, B0 (Type: O, Polarity: High/Low)
    Provides the 2-bit digital signal for the intensity of blue used in composing
    a pixel's displayed color. These two signals are inputs to a simple 2-bit
    DAC (external to the Controller) that provides the analog signal required
    by the VGA monitor.


9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of
pixels and a frame of lines. The actual time values differ according to the
resolution selected (640x480 or 800x600), the processor-defined values for
DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
(CLK).


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external
picture memory, reading in data a byte at a time and formatting each pixel to
be displayed. For each pixel, the 6-bit RGB value required for the control of
the monitor's three primary color electron guns is generated, so that the pixel
is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host
microcontroller into external memory space (RAM). This memory space can
be located anywhere (e.g. using a block of RAM within the FPGA design, or
using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a
byte depends on which color palette is being used for the image (selected by
the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel
  (DATA[7..6] are not used)


The size of memory required to store a picture is determined by the total
number of viewable pixels in a line (determined by DISPSIZE_H), the total
number of viewable lines in a frame (determined by DISPSIZE_V), and the
number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
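As a worked instance of this formula, the following Python sketch computes the picture memory size for each palette. The mode keys and the function name `picture_bytes` are hypothetical labels chosen for the example.

```python
# Pixels packed into each byte, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory needed, rounded up to a whole byte."""
    per_byte = PIXELS_PER_BYTE[mode]
    pixels = pix_total * lin_total
    return -(-pixels // per_byte)   # ceiling division


sizes_vga = {mode: picture_bytes(640, 480, mode) for mode in PIXELS_PER_BYTE}
sizes_svga = {mode: picture_bytes(800, 600, mode) for mode in PIXELS_PER_BYTE}
```

For a full 640x480 frame this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.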

The address in RAM where the next pixel is stored is determined using
an internal signal provided by the Synchronization Unit, hvcnt, which
reflects the current contents of the MEMADDR register. The exact addressing
is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels)
is determined by using bits 18..3 of hvcnt and right shifting the contents by
three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels)
is determined by using bits 18..1 of hvcnt and right shifting the contents by
one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is
determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
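The three addressing rules amount to a right shift of the 19-bit pixel counter by 3, 1 or 0 bits. A minimal Python sketch is shown below; the mode keys and the function name `addr_pixel` are hypothetical.

```python
# Right shift applied to hvcnt for each palette:
# 8 pixels per byte -> 3, 2 pixels per byte -> 1, 1 pixel per byte -> 0.
SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory holding the pixel numbered hvcnt."""
    return (hvcnt & 0x7FFFF) >> SHIFT[mode]   # hvcnt is a 19-bit value


# Pixels 8..15 share byte 1 in Black & White mode.
bw_addresses = [addr_pixel(n, "bw") for n in range(8, 16)]
# Pixels 2 and 3 share byte 1 in 16 Colors mode.
c16_addresses = [addr_pixel(n, "16") for n in (2, 3)]
```

The highest address produced for a 640x480 Black & White frame is 38399, one less than the 38400 bytes such a picture occupies.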

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the
current address in memory space. The register, PIXREG, is reset to zero when
the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex. This
internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen: Black & White, 16 Colors, or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the
lowest three bits of the hvcnt signal, received from the Synchronization Unit,
are "000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into this
active pixel position by shifting the contents of the register right by one bit on
each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the
lowest bit of the hvcnt signal, received from the Synchronization Unit, is a
"0".

For the currently loaded byte, the active pixel is always in the low order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel position
by shifting the contents of the register right by four bits on the rising edge of
CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge
of the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.
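The shifting behaviour of the Pixel register can be modelled in Python as follows. This is a sketch of the pixel order implied by the text (the active pixel sits in the lowest bits, and the register is shifted right), not the register implementation itself. The function name and mode keys are hypothetical.

```python
def unpack_byte(data: int, mode: str) -> list:
    """Return the pixels held in one byte, in the order they become active."""
    if mode == "bw":
        # Eight 1-bit pixels: bit 0 first, then shift right by one each clock.
        return [(data >> i) & 0x1 for i in range(8)]
    if mode == "16":
        # Two 4-bit pixels: low nibble first, then shift right by four.
        return [data & 0xF, (data >> 4) & 0xF]
    if mode == "64":
        # One 6-bit pixel in DATA[5..0]; bits 7 and 6 are not used.
        return [data & 0x3F]
    raise ValueError("unknown mode: " + mode)


bw_pixels = unpack_byte(0b10110001, "bw")
c16_pixels = unpack_byte(0xA3, "16")
c64_pixels = unpack_byte(0xFF, "64")
```

For example, the byte 0xA3 in 16 Colors mode yields pixel value 3 first and pixel value 10 second.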

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for
driving the red, green and blue color guns of the target monitor. When the
chosen color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register, RGBREG, is reset to zero (000000) when the VGA
Controller receives an external reset signal (RST). This RGB code represents
black.

The register is updated on each rising edge of the CLK signal, and the
value loaded is dependent on the state of the line display enable signal, en.
When en is Low, blanking is required and RGBREG is loaded with the code
for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun control
circuits. This is achieved by using 2-bit digital to analog converter circuitry,
located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations.
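The 64-color count and the four analog levels can be illustrated with a short Python sketch. Two points are assumptions rather than facts from the source: the bit layout of the 6-bit code (red in bits 5..4, green in bits 3..2, blue in bits 1..0) and the even spacing of the four DAC levels between 0 V and 0.7 V.

```python
def split_rgb(rgbreg: int) -> tuple:
    """Split a 6-bit RGB code into 2-bit (red, green, blue) values.

    Assumed layout: bits 5..4 = red, bits 3..2 = green, bits 1..0 = blue.
    """
    return ((rgbreg >> 4) & 0x3, (rgbreg >> 2) & 0x3, rgbreg & 0x3)


def dac_volts(level: int) -> float:
    """Map a 2-bit level (0..3) to a voltage, assuming evenly spaced steps."""
    return 0.7 * level / 3


example = split_rgb(0b110100)                           # (3, 1, 0)
levels = [dac_volts(n) for n in range(4)]               # 0 V up to 0.7 V
distinct_colors = len({split_rgb(v) for v in range(64)})
```

Three channels with four levels each give 4 x 4 x 4 = 64 distinct combinations, which matches the 64 permutations stated in the text.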

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3
Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers, and shift
registers, each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator™ system or
via VHDL or Verilog instantiation.


Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which is allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical, and both ports are
interchangeable and both ports support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs
to create deeper and wider memory organizations, with a minimal timing
penalty incurred through specialized routing resources.
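To relate these figures to a frame buffer, the following Python sketch estimates how many block RAMs a picture would occupy. It assumes only the 16K data bits of each block are used for pixels, and it takes the block counts from the device table. The function name and the choice of a 1-bit-per-pixel 640x480 picture are illustrative.

```python
DATA_BITS_PER_BLOCK = 16 * 1024      # data portion of each 18,432-bit block
TOTAL_BITS_PER_BLOCK = 18432

# Total RAM blocks per device, from the Spartan-3E table.
BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total: int, lin_total: int, bits_per_pixel: int) -> int:
    """Block RAMs needed to hold one picture, rounded up."""
    bits = pix_total * lin_total * bits_per_pixel
    return -(-bits // DATA_BITS_PER_BLOCK)   # ceiling division


needed = blocks_needed(640, 480, 1)          # 1-bit (Black & White) picture
fits = [dev for dev, count in BLOCKS.items() if count >= needed]
```

A 1-bit 640x480 picture needs 19 blocks, so under these assumptions it fits only in the XC3S500E and larger devices.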


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that
  involved large circuits and multiple teams of engineers
• Funded by the US Department of Defense
• The first publicly available version was released in 1985
• In 1986, IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
  presented with a proposal to standardize VHDL
• In 1987, standardization => IEEE 1076-1987
• An improved version of the language was released in 1994 => IEEE
  standard 1076-1993

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and
  high-impedance
• Soon after IEEE 1076-1987 was released, simulator companies began using
  their own, non-standard types => VHDL was becoming nonstandard
• The IEEE 1164 standard was developed by IEEE
• IEEE 1164 contains definitions for a nine-valued data type, std_logic
• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they
  relate to actual hardware
• It defines, e.g., two numeric types: signed and unsigned

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

• A black box with an interface definition
• Defines the inputs/outputs of a component (defines pins)
• A way to represent modularity in VHDL
• Similar to a symbol in a schematic
• The entity declaration describes the entity

E.g.:

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

• Provide channels of communication between the component and its
  environment
• Each port must have a name, a direction and a type
• An entity may have NO port declaration


Port directions

• In: a value of a port can be read inside the component, but cannot be
  assigned. Multiple reads of the port are allowed.
• Out: assignments can be made to a port, but data from a port cannot be read.
  Multiple assignments are allowed.
• Inout: bi-directional; assignments can be made and data can be read.
  Multiple assignments are allowed.
• Buffer: an out port with read capability. May have at most one assignment
  (not recommended).

Architectures

• Every entity has at least one architecture
• One entity can have several architectures
• Architectures can describe a design using:
  – Behavior
  – Structure
  – Dataflow
• Architectures can describe a design on many levels:
  – Gate level
  – RTL (Register Transfer Level)
  – Behavioral level
• A configuration declaration links an architecture to an entity

E.g.:

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

• Link the entity declaration and architecture body together
• The concept of a default configuration is a bit messy in VHDL '87
  – The last architecture analyzed links to the entity
• Can be used to change simulation behavior without re-analyzing the VHDL
  source
• Complex configuration declarations are ignored in synthesis
• Some entities can have, e.g., a gate level architecture and a behavioral
  architecture
• Are always optional

Packages

Packages contain information common to many design units.

1. Package declaration
  – constant declarations
  – type and subtype declarations
  – function and procedure declarations
  – global signal declarations
  – file declarations
  – component declarations

2. Package body
  – is not necessarily needed
  – function bodies
  – procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units. A package consists of a declaration part and an optional
body part.

A package declaration can contain:
  – type and subtype declarations
  – subprograms
  – constants
  – alias declarations
  – global signal declarations
  – file declarations
  – component declarations

A package body consists of:
  – subprogram declarations and bodies
  – type and subtype declarations
  – deferred constants
  – file declarations

Libraries

Collection of VHDL design units (database):

1. Packages
  – package declaration
  – package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but it can also be any
other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the HW.

It is possible to describe a circuit in a number of ways, listed here from the
lowest to the highest level of abstraction:

• Structural
• Dataflow
• Behavioral

Structural VHDL description

• The circuit is described in terms of its components
• From a low-level description (e.g. transistor-level description) to a high
  level description (e.g. block diagram)
• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

• The circuit is described in terms of how data moves through the system
• In the dataflow style, you describe how information flows between registers
  in the system
• The combinational logic is described at a relatively high level; the placement
  and operation of registers is specified quite precisely
• The behavior of the system over time is defined by registers
• There are no built-in registers in the VHDL language
  – Either a lower level description
  – or a behavioral description of sequential elements is needed
• The lower level register descriptions must be created or obtained
• If there are no 3rd party models for registers => you must write the
  behavioral description of registers
• The behavioral description can be provided in the form of subprograms
  (functions or procedures)

10.4 Behavioral VHDL Description

• The circuit is described in terms of its operation over time
• The representation might include, e.g., state diagrams, timing diagrams and
  algorithmic descriptions
• The concept of time may be expressed precisely using delays (e.g. A <= B
  after 10 ns)
• If no actual delay is used, the order of sequential operations is defined
• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
  timing specifications
• The actual timing results depend on the implementation technology and the
  efficiency of the synthesis tool
• There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement

Statements inside a process statement are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
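To illustrate the general format, the sketch below writes the same combinational behavior twice: once with a sensitivity list and once with a wait statement. The entity name and2_demo and its ports are hypothetical and serve only as an example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2_demo is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end entity and2_demo;

architecture behavioral of and2_demo is
begin
  -- Form 1: sensitivity list. The process wakes up whenever
  -- a or b has an event, runs to the end, then suspends.
  p_list : process (a, b)
  begin
    y1 <= a and b;
  end process p_list;

  -- Form 2: no sensitivity list, explicit wait as the last statement.
  -- The process runs once at start-up, then suspends at the wait
  -- until a or b has an event.
  p_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process p_wait;
end architecture behavioral;
```

Both processes are themselves concurrent statements inside the architecture, while the assignments inside each process execute sequentially.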


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Frame memory: one bit per pixel, 18-bit address
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  -- Port modes match the sync_unit entity declaration
  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the board clock clk50m by two to obtain the pixel clock clk
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within a line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, wraps after 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en_s, used to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
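The numeric limits in the synchronization block can be gathered into named constants, which also makes the resulting line and frame rates easy to check. The package below is only a sketch: the package name vga_timing_pkg and all constant names are hypothetical, the values are read off the comparisons in the code above, and the 25 MHz pixel clock assumes that clk50m really is a 50 MHz clock divided by two.

```vhdl
package vga_timing_pkg is
  -- Assumption: clk50m is 50 MHz, so the divided clock is 25 MHz.
  constant PIXEL_CLK_HZ   : natural := 25_000_000;

  -- Horizontal timing, in pixel clocks (c_hs_s counts 0..793).
  constant H_TOTAL        : natural := 794;
  constant H_SYNC_LAST    : natural := 95;   -- hsync low while c_hs_s <= 95
  constant H_ACTIVE_FIRST : natural := 143;  -- h_en window start
  constant H_ACTIVE_LAST  : natural := 778;  -- h_en window end
  constant H_ACTIVE_CLKS  : natural := H_ACTIVE_LAST - H_ACTIVE_FIRST + 1;

  -- Vertical timing, in lines (line_count_s counts 0..515).
  constant V_TOTAL        : natural := 516;
  constant V_SYNC_LINES   : natural := 2;    -- vsync low while line_count_s < 2
  constant V_ACTIVE_FIRST : natural := 35;   -- v_en window start
  constant V_ACTIVE_LAST  : natural := 515;  -- v_en window end
  constant V_ACTIVE_LINES : natural := V_ACTIVE_LAST - V_ACTIVE_FIRST + 1;

  -- Derived rates (integer division truncates).
  constant LINE_RATE_HZ   : natural := PIXEL_CLK_HZ / H_TOTAL;
  constant FRAME_RATE_HZ  : natural := PIXEL_CLK_HZ / (H_TOTAL * V_TOTAL);
end package vga_timing_pkg;
```

Under these assumptions one line takes 794 clocks, or about 31.8 microseconds, and one frame takes 794 x 516 = 409704 clocks, which gives a refresh rate of roughly 61 frames per second. The comparison windows as written span 636 clocks horizontally and 481 lines vertically.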


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of frame): restart the memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read memory and drive all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the stored image
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking region: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic.

Many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Fig Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
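The classic logic block described above can be modelled in a few lines of VHDL. This is a simulation sketch only, not a vendor primitive: the entity name logic_block and the generics INIT and USE_REG are hypothetical names chosen for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    -- 16-entry truth table of the 4-input LUT; bit i is the output for
    -- input value i. x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT    : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered (true) or unregistered (false) output.
    USE_REG : boolean := false);
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);
    y   : out std_logic);
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal reg_out : std_logic := '0';
begin
  -- The LUT is a 16:1 lookup addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      reg_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT value.
  y <= reg_out when USE_REG else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.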

For this example architecture, the locations of the FPGA logic block pins are shown below.

Fig Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

Fig Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
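As a sketch of the first step, RTL simulation with a test bench, the example below checks a divide-by-two clock stage like the one at the top of the VGA controller code. The entity names clk_div2 and clk_div2_tb are hypothetical, and the 20 ns stimulus period assumes a 50 MHz input clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: divide-by-two stage (hypothetical stand-alone entity).
entity clk_div2 is
  port (clk_in, rstn : in std_logic; clk_out : out std_logic);
end entity clk_div2;

architecture rtl of clk_div2 is
  signal q : std_logic := '0';
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports; it generates stimulus and checks responses.
entity clk_div2_tb is
end entity clk_div2_tb;

architecture sim of clk_div2_tb is
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
  signal done    : boolean := false;
begin
  dut : entity work.clk_div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- Stimulus clock: rising edges at 10 ns, 30 ns, 50 ns, ...
  clock : process
  begin
    while not done loop
      clk_in <= '0'; wait for 10 ns;
      clk_in <= '1'; wait for 10 ns;
    end loop;
    wait;  -- stop toggling so the simulation can end
  end process clock;

  stimulus : process
  begin
    wait for 25 ns;   -- reset held low across the first rising edge
    assert clk_out = '0'
      report "output should stay low during reset" severity error;
    rstn <= '1';
    wait for 10 ns;   -- now 35 ns, just after the edge at 30 ns
    assert clk_out = '1'
      report "first edge after reset should set the output" severity error;
    wait for 20 ns;   -- now 55 ns, just after the edge at 50 ns
    assert clk_out = '0'
      report "second edge should clear the output" severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the post-synthesis and post-layout netlists, which is how the later simulation stages reuse the checks written for the RTL stage.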

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig Gate level: VGA controller

Fig Gate level: sync unit

Fig Technology schematic

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Implemented on an FPGA, the VGA Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. A VGA display controller, by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"

  • Pixel
    • 6 SIGNALS IN VGA CONTROLLER
    • Gates
    • Market size
    • 122 FPGA Comparisons

    2 HISTORY

    The term Video Graphics Array (VGA) refers specifically to the display

    hardware first introduced with the IBM PS2 line of computers in 1987 but

    through its widespread adoption has also come to mean either an analog

    computer display standard the 15-pin D-subminiature VGA connector or the

    640times480 resolution itself While this resolution has been superseded in the

    personal computer market it is becoming a popular resolution on mobile

    devices

    Video Graphics Array (VGA) was the last graphical standard introduced by

    IBM that the majority of PC clone manufacturers conformed to making it

    today (as of 2009) the lowest common denominator that all PC graphics

    hardware supports before a device-specific driver is loaded into the computer

    For example the MS-Windows splash screen appears while the machine is still

    operating in VGA mode which is the reason that this screen always appears in

    reduced resolution and color depth

    VGA was officially superseded by IBMs XGA standard but in reality it was

    superseded by numerous slightly different extensions to VGA made by clone

    manufacturers that came to be known collectively as Super VGA

    21 VGA compared to other standard resolutions

    VGA is referred to as an array instead of an adapter because it was

    implemented from the start as a single chip (an ASIC) replacing the Motorola

    6845 and dozens of discrete logic chips that covered the full-length ISA boards

    of the MDA CGA and EGA Its single-chip implementation also allowed the

    VGA to be placed directly on a PCs motherboard with a minimum of

    difficulty (it only required video memory timing crystals and an external

    RAMDAC) and the first IBM PS2 models were equipped with VGA on the

    motherboard

    2

    22 The VGA specifications are as follows

    256 KB Video RAM (The very first cards could be ordered with 64KB or

    128KB of RAM at the cost of losing some video modes)

    16-color and 256-color modes

    262144-value color palette (six bits each for red green and blue)

    Selectable 25175 MHz or 28322 MHz master clock

    Maximum of 800 horizontal pixels

    Maximum of 600 lines

    Refresh rates at up to 70 Hz

    Vertical blank interrupt

    Packed-pixel mode 256 colors

    Hardware smooth scrolling support

    The VGA supports both All Points Addressable graphics modes and

    alphanumeric text modes Standard graphics modes are

    640times480 in 16 colors

    640times350 in 16 colors

    320times200 in 16 colors

    320times200 in 256 colors

    3

    3 DESCRIPTION ARCHITECTURE

    31 VGA display device

    VGA display device used for displaying the images taken from the

    processor in an exact format It can be used for displaying the output of any

    projects Analog graphs etchellip can be directly displayed on the VGA display

    monitor

    31 VGA (Video Graphics Array) Controller

    VGA controller controls the image to be exactly displayed on the display

    device It acts as an interface between processing system and display device so

    called as interface controller

    4

    Image or video processing system

    VGA Controller

    Display

    4 CONTROLLER DESIGN

    For this first we should know how the display device works Consider an

    image below

    640x480

    Original image

    5

    Pixels of image

    Display device also have pixels Display resolution (640480) is a

    standard resolution of display

    Pixel

    This article is about the picture element For other uses see Pixel

    (disambiguation)

    This example shows an image with a portion greatly enlarged in which the

    individual pixels are rendered as little squares and can easily be seen

    6

    42 A photograph of sub-pixel display elements on a laptops LCD screen

    In digital imaging a pixel (or picture element [1]) is a single point in a raster

    image The pixel is the smallest addressable screen element it is the smallest

    unit of picture which can be controlled Each Pixel has its address The address

    of a pixel corresponds to its coordinate Pixels are normally arranged in a 2-

    dimensional grid and are often represented using dots or squares Each pixel is

    a sample of an original image where more samples typically provide more-

    accurate representations of the original The intensity of each pixel is variable

    In color image systems a color is typically represented by three or four

    component intensities such as red green and blue or cyan magenta yellow

    and black

    Color refreshing rate

    Color refreshing rate = 60 Hz

    The color refreshing rate (refresh rate) is the number of frames
    transmitted per second. The VGA controller must transmit at least 60 frames
    per second.

    5 Hardware Description

    5.1 Block Diagram

    5.1 VGA Synchronization Unit

    The Synchronization Unit provides the horizontal and vertical
    synchronization signals - HSYNC and VSYNC - that are required to correctly
    display a picture frame within the confines of a monitor's display area.

    These synchronization signals are used as control inputs by the
    monitor's horizontal and vertical deflection circuits. These circuits deflect
    the electrons emitted by the three primary color electron guns (Red, Green,
    Blue) from left to right and from top to bottom, respectively. HSYNC provides
    the start and stop times for the horizontal deflection circuit, so that a line
    of pixels is correctly drawn across the screen display. VSYNC provides the
    start and stop times for the vertical deflection circuit, so that the lines of
    a frame are correctly drawn from the top to the bottom of the screen display.

    The resolution for the display is defined by the level on the
    RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low,
    the 800x600 resolution (SVGA) is used.

    Although the resolution determines the area of a monitor's screen
    within which an image can be displayed, the full extents of the chosen
    resolution do not have to be used. The actual extents of the image display
    area on the screen can be controlled by the use of the DISPSIZE_H and
    DISPSIZE_V inputs. These inputs determine the total number of pixels to be
    used in a line and the total number of lines in a frame, respectively.

    5.2 Blank pixel generation

    The total number of viewable pixels in a line and viewable lines in a
    frame is determined by the display resolution chosen through the
    RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
    the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
    chosen display resolution is used or not, the areas of the monitor screen to
    the top, bottom, left and right of the viewable frame area are blanked by
    putting black pixels at the required line-pixel positions. This has the effect
    of centering the image on the screen.

    The color generated for a pixel in the Pixel Unit depends on whether
    the particular pixel needs to be blanked or not. The Synchronization Unit
    provides a signal to the Pixel Unit for this purpose: the line display enable
    signal, en. The signal is checked on each rising edge of the external clock
    signal (CLK) and is set as follows:

        If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
            en = 0 (pixel needs to be blanked - set color to black)
        Else
            en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
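    The rule for en can be sketched in VHDL as a single clocked process. This is
    a minimal sketch under stated assumptions, not the Controller's actual
    source: the entity name blank_enable, the port names, the 11-bit counter
    widths and the value of en during reset are all assumptions made for
    illustration.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical helper entity: derives the line display enable signal "en".
    entity blank_enable is
      port (
        clk       : in  std_logic;
        rst       : in  std_logic;               -- active-high reset, like RST
        hcount    : in  unsigned(10 downto 0);   -- HCOUNT (assumed width)
        vcount    : in  unsigned(10 downto 0);   -- VCOUNT (assumed width)
        pix_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_H
        lin_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_V
        en        : out std_logic
      );
    end entity blank_enable;

    architecture rtl of blank_enable is
    begin
      process (clk)
      begin
        if rising_edge(clk) then
          if rst = '1' then
            en <= '0';                            -- assumption: blank during reset
          elsif (hcount >= pix_total) or (vcount >= lin_total) then
            en <= '0';                            -- outside the viewable area: blank
          else
            en <= '1';                            -- viewable pixel
          end if;
        end if;
      end process;
    end architecture rtl;
    ```

    The numeric_std comparison operators accept operands of different widths,
    so the 10-bit display size can be compared directly with a wider counter.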

    6 SIGNALS IN VGA CONTROLLER

    For the most common VGA mode (640x480, 60 Hz, non-interlaced) the
    horizontal timings are:

    Parameter                      Value      Unit
    Clock frequency                25.175     MHz
    Horizontal pixels              640
    Horizontal sync polarity       Negative
    Total time for each line       31.77      µs
    Front porch (A)                0.94       µs
    Sync pulse length (B)          3.77       µs
    Back porch (C)                 1.89       µs
    Active video (D)               25.17      µs

    (Total horizontal sync time: 6.60 µs)

    6.1 The vertical timings are:

    Parameter                      Value      Unit
    Vertical lines                 480
    Vertical sync polarity         Negative
    Vertical frequency             59.94      Hz
    Front porch (E)                0.35       ms
    Sync pulse length (F)          0.06       ms
    Back porch (G)                 1.02       ms
    Active video (H)               15.25      ms

    (Total vertical sync time: 1.43 ms)

    640x400 at 70 Hz is the video mode used for booting most x86 personal
    computers.

    640x480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

    It should be noted that the actual timings vary slightly. For example, for
    640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
    25.174 MHz gives 633 pixels rather than the expected 640 pixels.

    6.2 Generation of the horizontal synchronization signal - HSYNC

    The HSYNC signal is High (inactive) after an external reset signal
    (RST) is received by the VGA Controller. The signal is updated on each rising
    edge of the external clock signal (CLK).

    The state of the HSYNC signal depends on the value stored in the horizontal
    counter. It is driven low when

        HCOUNT >= (PixTotal + BlankingLeft)

    and remains low while

        HCOUNT < (PixTotal + BlankingLeft + TLSync)

    [Figure: Vertical (Frame) Period]

    6.3 Generation of the vertical synchronization signal - VSYNC

    The VSYNC signal is High (inactive) after an external reset signal
    (RST) is received by the VGA Controller. The signal is updated after every
    line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

    The state of the VSYNC signal depends on the value stored in the vertical
    counter. It is driven low when

        VCOUNT >= (LinTotal + BlankingLeft)

    and remains low while

        VCOUNT < (LinTotal + BlankingLeft + TFSync)
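    The HSYNC and VSYNC conditions have the same shape: the output is low while
    the counter lies inside a window that starts after the visible area plus the
    blanking interval. A minimal VHDL sketch of that comparison follows. The
    entity name sync_pulse, the generic names and the port names are hypothetical,
    and the sketch takes the window limits from generics, whereas the Controller
    described here derives the visible size from DISPSIZE_H and DISPSIZE_V at run
    time.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical sketch of an active-low sync pulse generator.
    entity sync_pulse is
      generic (
        VISIBLE  : natural;   -- PixTotal (HSYNC) or LinTotal (VSYNC)
        BLANKING : natural;   -- blanking interval before the pulse
        SYNC_LEN : natural    -- TLSync (HSYNC) or TFSync (VSYNC)
      );
      port (
        clk   : in  std_logic;   -- CLK for HSYNC; HSYNC itself for VSYNC
        rst   : in  std_logic;   -- active-high reset, like RST
        count : in  natural;     -- HCOUNT or VCOUNT
        sync  : out std_logic
      );
    end entity sync_pulse;

    architecture rtl of sync_pulse is
    begin
      process (clk, rst)
      begin
        if rst = '1' then
          sync <= '1';                                    -- High (inactive) after reset
        elsif rising_edge(clk) then
          if (count >= VISIBLE + BLANKING) and
             (count <  VISIBLE + BLANKING + SYNC_LEN) then
            sync <= '0';                                  -- inside the sync window
          else
            sync <= '1';
          end if;
        end if;
      end process;
    end architecture rtl;
    ```

    Two instances would be needed, one clocked by CLK with HCOUNT as input and
    one clocked by HSYNC with VCOUNT as input, matching the update rules given
    for the two signals.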

    Address Counter (MEMADDR)

    The address counter is used to store the position of the next consecutive
    pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
    signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
    the next pixel from picture memory.

    The counter is reset to zero when the VGA Controller receives an external
    reset signal (RST). The size of the counter depends on the values chosen for
    DISPSIZE_H and DISPSIZE_V, as the range is simply

        0 to (PixTotal x LinTotal) - 1

    Taking the maximum number of pixels in a line and lines in a frame for each
    of the supported display resolutions, the maximum ranges for the counter are:

    • 640x480: 0 to 307199
    • 800x600: 0 to 479999

    The counter has 19-bit resolution.

    While the value in the horizontal counter (HCOUNT) is less than the
    total number of viewable pixels in a line (PixTotal, the integer value of
    DISPSIZE_H), the counter is incremented on the rising edge of the external
    clock signal (CLK). Pixel addressing within the frame is consecutive. When
    the counter reaches the last pixel in a line, its incremented value is the
    first pixel in the next line down.

    The address counter will continue to be incremented until the value in the
    vertical counter (VCOUNT) is greater than or equal to the total number of
    viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
    point, it will be rolled over to zero.
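    The counting rules for MEMADDR can be sketched as follows. This is an
    illustrative sketch only: the entity name addr_counter, the port names and
    the 11-bit widths of the HCOUNT and VCOUNT inputs are assumptions. The 19-bit
    width follows from the ranges above, since 2^19 = 524288 is the smallest
    power of two above 479999.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical sketch of the address counter (MEMADDR).
    entity addr_counter is
      port (
        clk       : in  std_logic;
        rst       : in  std_logic;               -- active-high reset, like RST
        hcount    : in  unsigned(10 downto 0);   -- HCOUNT (assumed width)
        vcount    : in  unsigned(10 downto 0);   -- VCOUNT (assumed width)
        pix_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_H
        lin_total : in  unsigned(9 downto 0);    -- integer value of DISPSIZE_V
        memaddr   : out unsigned(18 downto 0)    -- 19-bit pixel position
      );
    end entity addr_counter;

    architecture rtl of addr_counter is
      signal addr : unsigned(18 downto 0) := (others => '0');
    begin
      memaddr <= addr;

      process (clk, rst)
      begin
        if rst = '1' then
          addr <= (others => '0');
        elsif rising_edge(clk) then
          if vcount >= lin_total then
            addr <= (others => '0');   -- past the last viewable line: roll over
          elsif hcount < pix_total then
            addr <= addr + 1;          -- consecutive addressing within the frame
          end if;                      -- otherwise hold during horizontal blanking
        end if;
      end process;
    end architecture rtl;
    ```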

    7 VGA Controller

    The VGA Controller provides a simple 8-bit interface between a host
    microcontroller and any VGA-compatible monitor. This variant of the
    Controller provides six modes of display, depending on the resolution chosen
    (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
    White, 16 Colors or 64 Colors).

    7.1 VGA Controller Tasks

    1) To display the frames at a minimum refresh rate of 60 Hz:

        1 sec = 60 frames
        1 frame = max 1/60 sec
        1 pixel = (1/60 sec) / 307200 = approx. 54 ns (640 x 480 = 307200 pixels)
        Max time for 1 pixel = 40 ns

       The 54 ns figure counts only the visible pixels. Once the front porch,
       sync pulse and back porch intervals listed in section 6 are included, the
       time available for each pixel falls to about 40 ns, which matches the
       period of the 25.175 MHz clock given there (about 39.7 ns).

    2) To send information for each and every pixel of a frame by means of
       the RGB signals.

    3) After one row has been displayed completely, the column must be
       reset and the row incremented. This is done by the Hsync signal.

    4) After one frame has been completed, a new frame has to be started; for
       this, both row and column are reset. This is done by the Vsync signal.

    => To give the information for 1 pixel we use 1 clock (which controls all
    operations):

        1 clock = 1 pixel
        1 clock cycle period = 40 ns
        Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

    So if we use a 25 MHz clock and display one pixel on each and every
    clock, we can attain the minimum 60 Hz refresh rate. Do not use a clock
    frequency lower than 25 MHz.

    Ex:

        Clock                    Time period   No. of frames/sec
        15 MHz (decreases)       More          Less (unable to display 60 frames/sec)
        25 MHz                   (effectively utilized frequency)
        30 MHz (increases)       Less          More

    Signals used for image display:

    • R (Red)
    • G (Green)
    • B (Blue)
    • HSYNC (Horizontal synchronization signal)
    • VSYNC (Vertical synchronization signal)

    All the above signals are used to control the image data so that it is
    correctly displayed on the VGA display monitor.

    7.2 Timings for front porch, sync pulse and back porch

                     Horizontal sync    Vertical sync
    Front porch      0.94 µs            0.35 ms
    Sync pulse       3.77 µs            0.063 ms
    Back porch       1.89 µs            1.02 ms

    8 FUNCTIONAL DESCRIPTION

    8.1 Symbol

    [Figure: VGA Controller Symbol]

    Pin description

    The pin description is shown in the table below.

    Table: VGA Pin description

    Name         Type  Polarity  Bus size  Description

    Global Control Signals

    CLK          I     Rising    -         Global system clock. This clock determines
                                           the maximum rate at which pixels can be
                                           sent to the monitor. The frequency of the
                                           clock signal determines the refresh rate
                                           as follows:
                                             640x480: CLK = 25 MHz => Refresh = 60 Hz
                                                      CLK = 30 MHz => Refresh = 72 Hz
                                             800x600: CLK = 40 MHz => Refresh = 60 Hz
                                                      CLK = 50 MHz => Refresh = 72 Hz
    RST          I     High      -         Global system reset.

    VGA Input Settings Signals

    RESOLUTION   I     High/Low  -         Selects the screen resolution to be used:
                                             1 = 640x480 (VGA)
                                             0 = 800x600 (SVGA)
    CMOD         I     -         2         Selects the color palette to be used:
                                             00 = Black & White
                                             01 = 16 Colors
                                             10 = 64 Colors
    DISPSIZE_H   I     -         10        Determines the number of viewable pixels
                                           to be displayed in each line of a frame,
                                           and is therefore used to control the
                                           horizontal extents of the visible display
                                           area.
    DISPSIZE_V   I     -         10        Determines the number of lines to be
                                           displayed in a frame, and is therefore
                                           used to control the vertical extents of
                                           the visible display area.

    Data Memory Control Signals

    RD           O     High      -         Enable signal, active when data is
                                           required to be read from the memory
                                           space. This signal is controlled by, and
                                           follows, the internal line enable signal
                                           en generated by the Synchronization Unit
                                           of the Controller.
    DATA         I     -         8         Data input from picture memory space.
                                           Data is stored in memory in bytes, the
                                           content of which depends on the chosen
                                           color palette:
                                             Black & White mode: 8 1-bit pixels
                                             16 Colors mode:     2 4-bit pixels
                                             64 Colors mode:     1 6-bit pixel
                                                                 (DATA[5..0])
                                           Note: in 64 Colors mode, bits 7 and 6 of
                                           each byte are not used.
    ADDR_PIXEL   O     -         19        Specifies the address of the next pixel
                                           in picture memory. Addresses are
                                           consecutive: once the end of the current
                                           line has been reached, the next address
                                           is that of the pixel at the start of the
                                           next line down in the frame.

    VGA Monitor Control Signals

    HSYNC        O     Falling   -         Horizontal synchronization signal. Used
                                           to control the horizontal deflection
                                           circuit in the VGA monitor, so that the
                                           start and end of a line of pixels is
                                           correctly displayed across the visible
                                           display area of the screen. The
                                           horizontal size of the display area is
                                           controlled by the DISPSIZE_H input to the
                                           Controller.
    VSYNC        O     Falling   -         Vertical synchronization signal. Used to
                                           control the vertical deflection circuit
                                           in the VGA monitor, so that the start and
                                           end of a frame (of lines) is correctly
                                           displayed between the top and bottom
                                           edges of the visible display area of the
                                           screen. The vertical size of the display
                                           area is controlled by the DISPSIZE_V
                                           input to the Controller.
    R1, R0       O     High/Low  -         Provides the 2-bit digital signal for the
                                           intensity of red used in composing a
                                           pixel's displayed color. These two
                                           signals are inputs to a simple 2-bit DAC
                                           (external to the Controller) that
                                           provides the analog signal required by
                                           the VGA monitor.
    G1, G0       O     High/Low  -         As above, for the intensity of green.
    B1, B0       O     High/Low  -         As above, for the intensity of blue.

    9 VGA SIGNAL TIMING

    Figure 3 summarizes the signal timing involved in sending a line of
    pixels and a frame of lines. The actual time values differ according to the
    resolution selected (640x480 or 800x600), the processor-defined values for
    DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
    (CLK).

    9.1 VGA PIXEL UNIT

    The Pixel Unit provides access to the pixelated image stored in external
    picture memory, reading in data a byte at a time and formatting each pixel to
    be displayed. For each pixel, the 6-bit RGB value required for the control of
    the monitor's three primary color electron guns is generated, so that the
    pixel is displayed on the screen with the correct color.

    Accessing the stored image

    The image to be displayed on the monitor screen is written by the host
    microcontroller into external memory space (RAM). This memory space can
    be located anywhere (e.g. using a block of RAM within the FPGA design, or
    using the SRAM on the NanoBoard itself).

    Pixel data is stored in the memory space in bytes. The number of pixels in a
    byte depends on which color palette is being used for the image (selected by
    the CMOD input):

    • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels
    • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels
    • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
      (DATA[7..6] are not used)

    The size of memory required to store a picture is determined by the total
    number of viewable pixels in a line (determined by DISPSIZE_H), the total
    number of viewable lines in a frame (determined by DISPSIZE_V) and the
    number of pixels stored in each byte in memory space:

        Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
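    As a worked instance of the memory formula, a full 640x480 picture needs
    307200 / 8 = 38400 bytes in Black & White mode, 307200 / 2 = 153600 bytes in
    16 Colors mode and 307200 bytes in 64 Colors mode. The same calculation can
    be written as a VHDL function for use in constant declarations. The package
    and function names below are hypothetical, and the rounding up for a
    partially filled last byte is an added assumption rather than something the
    text specifies.

    ```vhdl
    -- Hypothetical helper package: bytes needed to store one picture.
    package picture_mem_pkg is
      function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
        return natural;
    end package picture_mem_pkg;

    package body picture_mem_pkg is
      function picture_bytes (pix_total, lin_total, pixels_per_byte : positive)
        return natural is
      begin
        -- Integer division, rounded up so a partially filled byte is counted.
        return (pix_total * lin_total + pixels_per_byte - 1) / pixels_per_byte;
      end function picture_bytes;
    end package body picture_mem_pkg;
    ```

    A design could then size its picture memory with, for example,
    picture_bytes(640, 480, 8).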

    The address in RAM where the next pixel is stored is determined using
    an internal signal provided by the Synchronization Unit - hvcnt - which
    reflects the current contents of the MEMADDR register. The exact addressing
    is described below.

    9.2 Black & White mode

    The picture memory address - the byte of data containing the next 8 pixels
    - is determined by using bits 18..3 of hvcnt, i.e. by right shifting the
    contents by three:

        ADDR_PIXEL = "00" & hvcnt[18..3]

    16 Colors mode

    The picture memory address - the byte of data containing the next 2 pixels
    - is determined by using bits 18..1 of hvcnt, i.e. by right shifting the
    contents by one:

        ADDR_PIXEL = '0' & hvcnt[18..1]

    64 Colors mode

    The picture memory address - the byte of data containing the next pixel - is
    determined by using the full value of hvcnt:

        ADDR_PIXEL = hvcnt[18..0]
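    The three addressing cases amount to a right shift of hvcnt selected by CMOD.
    A minimal sketch, assuming the CMOD encoding from the pin description
    (00 = Black & White, 01 = 16 Colors, 10 = 64 Colors); the entity name
    addr_pixel_mux is hypothetical, and shift_right is used so that the result
    is zero-filled to the full 19 bits in every mode.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;
    use ieee.numeric_std.all;

    -- Hypothetical sketch: picture memory address selection by color mode.
    entity addr_pixel_mux is
      port (
        cmod       : in  std_logic_vector(1 downto 0);
        hvcnt      : in  unsigned(18 downto 0);
        addr_pixel : out unsigned(18 downto 0)
      );
    end entity addr_pixel_mux;

    architecture rtl of addr_pixel_mux is
    begin
      with cmod select
        addr_pixel <= shift_right(hvcnt, 3) when "00",    -- 8 pixels per byte
                      shift_right(hvcnt, 1) when "01",    -- 2 pixels per byte
                      hvcnt                 when others;  -- 1 pixel per byte
    end architecture rtl;
    ```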

    The Pixel register (PIXREG)

    The Pixel register is used to receive the byte of pixel data read from the
    current address in memory space. The register PIXREG is reset to zero when
    the VGA Controller receives an external reset signal (RST).

    The register is updated on each rising edge of the CLK signal. Data can be
    read from the memory space as long as the RD signal is active (High). The RD
    signal is itself controlled by the external line display enable signal, enex.
    This internally generated signal is defined as follows:

        If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
           ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
            enex = 0
        Else
            enex = 1

    When enex is Low, read access from memory is disabled (RD = 0).

    The point at which data is loaded from memory into PIXREG depends on the
    particular color palette that is chosen - Black & White, 16 Colors or 64
    Colors.

    Black & White mode

    The next byte of data will be loaded into the register whenever the
    lowest three bits of the hvcnt signal - received from the Synchronization
    Unit - are "000".

    For the currently loaded byte, the active pixel is always in the lowest bit
    position of the Pixel register. Each pixel in the data byte is moved into this
    active pixel position by shifting the contents of the register right by one
    bit on each rising edge of CLK.

    16 Colors mode

    The next byte of data will be loaded into the register whenever the
    lowest bit of the hvcnt signal - received from the Synchronization Unit - is
    '0'.

    For the currently loaded byte, the active pixel is always in the low-order
    nibble of the Pixel register. Remember that in this mode each byte of data
    contains two pixels. The second pixel is moved into the active pixel position
    by shifting the contents of the register right by four bits on the rising
    edge of CLK.

    64 Colors mode

    The next byte of data will be loaded into the register on the rising edge
    of the external system clock signal (CLK). In this mode, the read of pixel
    data does not depend on the status of the hvcnt signal received from the
    Synchronization Unit.
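    The load and shift behaviour of PIXREG in the three modes can be sketched as
    one clocked process. This is an illustration under stated assumptions, not
    the Controller's source: the entity name pixreg_sketch and its port names are
    hypothetical, and the behaviour when a load falls due while RD is Low (the
    register simply keeps shifting or holds) is an assumption.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical sketch of the Pixel register load/shift logic.
    entity pixreg_sketch is
      port (
        clk    : in  std_logic;
        rst    : in  std_logic;                       -- active-high, like RST
        rd     : in  std_logic;                       -- memory read enable
        cmod   : in  std_logic_vector(1 downto 0);    -- 00 B&W, 01 16 col, 10 64 col
        hvcnt  : in  std_logic_vector(18 downto 0);
        data   : in  std_logic_vector(7 downto 0);    -- byte from picture memory
        pixreg : out std_logic_vector(7 downto 0)
      );
    end entity pixreg_sketch;

    architecture rtl of pixreg_sketch is
      signal reg : std_logic_vector(7 downto 0) := (others => '0');
    begin
      pixreg <= reg;

      process (clk, rst)
        variable load : boolean;
      begin
        if rst = '1' then
          reg <= (others => '0');
        elsif rising_edge(clk) then
          -- Decide whether a new byte is due in the selected mode.
          case cmod is
            when "00"   => load := hvcnt(2 downto 0) = "000";  -- every 8th pixel
            when "01"   => load := hvcnt(0) = '0';             -- every 2nd pixel
            when others => load := true;                       -- every pixel
          end case;

          if load and rd = '1' then
            reg <= data;                        -- load the next byte
          elsif cmod = "00" then
            reg <= '0' & reg(7 downto 1);       -- next 1-bit pixel into bit 0
          elsif cmod = "01" then
            reg <= "0000" & reg(7 downto 4);    -- second 4-bit pixel into low nibble
          end if;
        end if;
      end process;
    end architecture rtl;
    ```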

    9.3 The RGB register (RGBREG)

    The RGB register is used to store the six bits that are required for
    driving the red, green and blue color guns of the target monitor. When the
    chosen color palette is either Black & White or 16 Colors, these six bits are
    obtained by mapping the value of the active pixel to a predefined RGB code.
    When the chosen color palette is 64 Colors, the actual pixel value is used
    directly.

    The register RGBREG is reset to zero (000000) when the VGA
    Controller receives an external reset signal (RST). This RGB code represents
    black.

    The register is updated on each rising edge of the CLK signal, and the
    value loaded depends on the state of the line display enable signal, en.
    When en is Low, blanking is required and RGBREG is loaded with the code
    for black (000000).

    Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
    and a 4-bit pixel (16 Colors mode) into the required RGB color code.

    The RGB color code stored in the RGB register is output from the VGA
    Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
    and B1).
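    The update rule for RGBREG can be sketched as below. Because the palette
    mapping table is not reproduced here, the sketch takes the mapped code for
    the Black & White and 16 Colors modes from a hypothetical input, mapped_rgb.
    The entity name, the port names and the order in which the six bits are
    split into R, G and B are likewise assumptions.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical sketch of the RGB register update.
    entity rgbreg_sketch is
      port (
        clk        : in  std_logic;
        rst        : in  std_logic;                      -- active-high, like RST
        en         : in  std_logic;                      -- line display enable
        cmod       : in  std_logic_vector(1 downto 0);   -- 10 = 64 Colors
        pixreg     : in  std_logic_vector(7 downto 0);   -- current Pixel register
        mapped_rgb : in  std_logic_vector(5 downto 0);   -- assumed palette lookup result
        r, g, b    : out std_logic_vector(1 downto 0)
      );
    end entity rgbreg_sketch;

    architecture rtl of rgbreg_sketch is
      signal rgbreg : std_logic_vector(5 downto 0) := (others => '0');
    begin
      process (clk, rst)
      begin
        if rst = '1' then
          rgbreg <= (others => '0');          -- 000000 = black
        elsif rising_edge(clk) then
          if en = '0' then
            rgbreg <= (others => '0');        -- blanking: load black
          elsif cmod = "10" then
            rgbreg <= pixreg(5 downto 0);     -- 64 Colors: pixel value used directly
          else
            rgbreg <= mapped_rgb;             -- B&W / 16 Colors: predefined RGB code
          end if;
        end if;
      end process;

      -- Assumed bit order: red in the top two bits, blue in the bottom two.
      r <= rgbreg(5 downto 4);
      g <= rgbreg(3 downto 2);
      b <= rgbreg(1 downto 0);
    end architecture rtl;
    ```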

    The monitor itself expects analog signals as inputs to its electron gun
    control circuits. This is achieved by using 2-bit digital-to-analog converter
    circuitry located on the NanoBoard itself, as shown in Figure 4.

    Figure 9.1: Digital to Analog RGB conversion

    For each color, the 2-bit digital signal from the VGA Controller can be
    converted into 4 distinct analog levels. These levels specify the intensity of
    each of the three primary colors to use when displaying the pixel on the
    monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
    (maximum brightness).

    With each analog input being one of four possible levels, the monitor
    can display each pixel on the screen with one of 64 different color
    permutations.

    Using Block RAM in Spartan-3E Generation FPGAs

    For applications requiring large on-chip memories, Spartan™-3
    Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
    Using various configuration options, SelectRAM blocks create RAM, ROM,
    FIFOs, large look-up tables, data width converters, circular buffers and shift
    registers, each supporting various data widths and depths. This application
    note describes the features and capabilities of block SelectRAM and
    illustrates how to specify the various options using the Xilinx CORE
    Generator™ system or via VHDL or Verilog instantiation.

    Table 2: Block RAM Available in Spartan-3E Devices

    Device      RAM       RAM Blocks   Total RAM   Total RAM   Total RAM
                Columns   per Column   Blocks      Bits        Kbits
    XC3S100E    1         4            4           73,728      72
    XC3S250E    2         6            12          221,184     216
    XC3S500E    2         10           20          368,640     360
    XC3S1200E   2         14           28          516,096     504
    XC3S1600E   2         18           36          663,552     648

    Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
    which is allocated to data storage and, in some memory configurations, an
    additional 2K bits allocated to parity or additional "plus" data bits.
    Physically, the block RAM memory has two completely independent access
    ports, labeled Port A and Port B. The structure is fully symmetrical: both
    ports are interchangeable and both support data read and write operations.
    Each memory port is synchronous, with its own clock, clock enable and write
    enable. Read operations are also synchronous and require a clock edge and
    clock enable. Though physically a dual-port memory, block RAM simulates
    single-port memory in an application, as shown in Figure 1. Furthermore, each
    block memory supports multiple configurations or aspect ratios. Table 3
    summarizes the essential SelectRAM features. Cascade multiple block RAMs
    to create deeper and wider memory organizations, with a minimal timing
    penalty incurred through specialized routing resources.

    10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

    Why (V)HDL?

    • Interoperability
    • Technology independence
    • Design reuse
    • Several levels of abstraction
    • Readability
    • Standard language
    • Widely supported

    What is VHDL?

    VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
    IC)

    • Design specification language
    • Design entry language
    • Design simulation language
    • Design documentation language
    • An alternative to schematics

    10.1 Brief History

    • VHDL was developed in the early 1980s for managing design problems that
      involved large circuits and multiple teams of engineers.
    • Funded by the US Department of Defense.
    • The first publicly available version was released in 1985.
    • In 1986, IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
      presented with a proposal to standardize VHDL.
    • In 1987, standardization => IEEE 1076-1987.
    • An improved version of the language was released in 1994 => IEEE
      standard 1076-1993.

    10.2 Related Standards

    • IEEE 1076 doesn't support simulation conditions such as unknown and
      high-impedance.
    • Soon after IEEE 1076-1987 was released, simulator companies began using
      their own, non-standard types => VHDL was becoming non-standard.
    • The IEEE 1164 standard was developed by the IEEE.
    • IEEE 1164 contains definitions for a nine-valued data type, std_logic.
    • IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they
      relate to actual hardware.
    • It defines, e.g., two numeric types: signed and unsigned.

    VHDL Environment

    Design Units

    • Segments of VHDL code that can be compiled separately and stored in a
      library.

    Entities

    • A black box with an interface definition.
    • Defines the inputs/outputs of a component (defines pins).
    • A way to represent modularity in VHDL.
    • Similar to a symbol in a schematic.
    • An entity declaration describes the entity.

    E.g.

        library IEEE;
        use IEEE.STD_LOGIC_1164.ALL;

        entity Comparator is
          port (A, B : in  std_logic_vector(7 downto 0);
                EQ   : out std_logic);
        end Comparator;

    Ports

    • Provide channels of communication between the component and its
      environment.
    • Each port must have a name, a direction and a type.
    • An entity may have NO port declaration.

    Port directions

    • In: The value of a port can be read inside the component but cannot be
      assigned. Multiple reads of the port are allowed.
    • Out: Assignments can be made to a port, but data from a port cannot be
      read. Multiple assignments are allowed.
    • Inout: Bi-directional; assignments can be made and data can be read.
      Multiple assignments are allowed.
    • Buffer: An out port with read capability. May have at most one assignment
      (not recommended).

    Architectures

    • Every entity has at least one architecture.
    • One entity can have several architectures.
    • Architectures can describe a design using:
      - Behavior
      - Structure
      - Dataflow
    • Architectures can describe a design on many levels:
      - Gate level
      - RTL (Register Transfer Level)
      - Behavioral level
    • A configuration declaration links an architecture to an entity.

    E.g.

        architecture Comparator1 of Comparator is
        begin
          EQ <= '1' when (A = B) else '0';
        end Comparator1;

    Configurations

    • Link an entity declaration and an architecture body together.
    • The concept of a default configuration is a bit messy in VHDL '87:
      - The last architecture analyzed links to the entity.
    • Can be used to change simulation behavior without re-analyzing the VHDL
      source.
    • Complex configuration declarations are ignored in synthesis.
    • Some entities can have, e.g., a gate-level architecture and a behavioral
      architecture.
    • Are always optional.
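    As an illustration of these points, a configuration declaration that binds
    the Comparator entity to its Comparator1 architecture might look like the
    sketch below. The entity and architecture are repeated so that the example is
    complete; the configuration name Comparator_cfg is hypothetical.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

    -- Hypothetical configuration: selects architecture Comparator1
    -- for entity Comparator.
    configuration Comparator_cfg of Comparator is
      for Comparator1
      end for;
    end Comparator_cfg;
    ```

    If a second architecture were added (for example a gate-level one), only the
    name inside the for clause would need to change to simulate it instead.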

    Packages

    Packages contain information common to many design units.

    1. Package declaration
       - constant declarations
       - type and subtype declarations
       - function and procedure declarations
       - global signal declarations
       - file declarations
       - component declarations
    2. Package body
       - is not necessarily needed
       - function bodies
       - procedure bodies

    Packages are meant for encapsulating data that can be shared globally among
    several design units. A package consists of a declaration part and an
    optional body part.

    A package declaration can contain:
       - type and subtype declarations
       - subprograms
       - constants
       - alias declarations
       - global signal declarations
       - file declarations
       - component declarations

    A package body consists of:
       - subprogram declarations and bodies
       - type and subtype declarations
       - deferred constants
       - file declarations
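    A small package showing several of these items together (a constant, a
    subtype, a deferred constant and a function with its body) might look as
    follows. All names and values in it are hypothetical and chosen only to fit
    the VGA theme of this document.

    ```vhdl
    -- Hypothetical example package.
    package vga_demo_pkg is
      constant H_VISIBLE : natural := 640;            -- constant declaration
      subtype pixel_t is natural range 0 to 63;       -- subtype declaration
      constant BLACK : pixel_t;                       -- deferred constant
      function is_visible (hcount, vcount : natural) return boolean;
    end package vga_demo_pkg;

    package body vga_demo_pkg is
      constant BLACK     : pixel_t := 0;              -- value of the deferred constant
      constant V_VISIBLE : natural := 480;            -- visible only inside the body

      function is_visible (hcount, vcount : natural) return boolean is
      begin
        return (hcount < H_VISIBLE) and (vcount < V_VISIBLE);
      end function is_visible;
    end package body vga_demo_pkg;
    ```

    A design unit would make these declarations visible with
    use work.vga_demo_pkg.all, assuming the package was compiled into library
    work.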

    Libraries

    A library is a collection of VHDL design units (a database):

    1. Packages
       - package declaration
       - package body
    2. Entities (entity declaration)
    3. Architectures (architecture body)
    4. Configurations (configuration declarations)

    • Usually a directory in a UNIX file system.
    • Can also be any other kind of database.

    Levels of Abstraction

    • VHDL supports many possible styles of design description, which differ
      primarily in how closely they relate to the HW.
    • It is possible to describe a circuit in a number of ways:

        Structural -> Dataflow -> Behavioral (increasing level of abstraction)

    Structural VHDL description

    • Circuit is described in terms of its components.
    • From a low-level description (e.g. transistor-level description) to a
      high-level description (e.g. block diagram).
    • For large circuits, a low-level description quickly becomes impractical.

    10.3 Dataflow VHDL Description

    • Circuit is described in terms of how data moves through the system.
    • In the dataflow style you describe how information flows between registers
      in the system.
    • The combinational logic is described at a relatively high level; the
      placement and operation of registers is specified quite precisely.
    • The behavior of the system over time is defined by registers.
    • There are no built-in registers in the VHDL language:
      - either a lower-level description
      - or a behavioral description of sequential elements is needed.
    • The lower-level register descriptions must be created or obtained.
    • If there are no 3rd-party models for registers => you must write the
      behavioral description of the registers.
    • The behavioral description can be provided in the form of subprograms
      (functions or procedures).

    10.4 Behavioral VHDL Description

    • Circuit is described in terms of its operation over time.
    • Representation might include, e.g., state diagrams, timing diagrams and
      algorithmic descriptions.
    • The concept of time may be expressed precisely using delays (e.g. A <= B
      after 10 ns).
    • If no actual delay is used, the order of sequential operations is defined.
    • In the lower levels of abstraction (e.g. RTL), synthesis tools ignore
      detailed timing specifications.
    • The actual timing results depend on the implementation technology and the
      efficiency of the synthesis tool.
    • There are a few tools for behavioral synthesis.

    Concurrent vs. Sequential

    Processes

    • Basic simulation concept in VHDL.
    • A VHDL description can always be broken up into interconnected processes.
    • Quite similar to a Unix process.
    • Process keyword in VHDL.
    • A process statement is a concurrent statement.
    • Statements inside process statements are sequential statements.
    • A process must contain either a sensitivity list or wait statement(s), but
      NOT both.
    • The sensitivity list or wait statement(s) contain the signals which wake
      the process up.

    General format:

        process [(sensitivity_list)]
            process_declarative_part
        begin
            process_statements
            [wait_statement]
        end process;
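    As a concrete instance of the general format, the sketch below describes a
    D flip-flop with an asynchronous reset using a process with a sensitivity
    list. The entity name dff, its port names and the active-high reset are
    illustrative assumptions.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical example: D flip-flop with asynchronous, active-high reset.
    entity dff is
      port (clk, rst, d : in  std_logic;
            q           : out std_logic);
    end entity dff;

    architecture behavioral of dff is
    begin
      -- The process wakes up whenever clk or rst changes; the statements
      -- between begin and end process are then executed in sequence.
      reg : process (clk, rst)
      begin
        if rst = '1' then
          q <= '0';
        elsif rising_edge(clk) then
          q <= d;
        end if;
      end process reg;
    end architecture behavioral;
    ```

    The same register could be written without a sensitivity list by ending the
    process with a wait statement such as wait on clk, rst; in line with the rule
    above, the two forms cannot be combined in one process.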


    11 VGA CONTROLLER CODE

    library IEEE

    use IEEESTD_LOGIC_1164ALL

    use IEEESTD_LOGIC_ARITHALL

    use IEEESTD_LOGIC_UNSIGNEDALL

    use IEEEnumeric_stdALL

    ---- Uncomment the following library declaration if instantiating

    ---- any Xilinx primitives in this code

    --library UNISIM

    --use UNISIMVComponentsall

    entity vga_controller is port(clk50min std_logic

    rstnin std_logic

    hsyncout std_logic

    vsyncout std_logic

    redout std_logic_vector(0 downto 0)

    greenout std_logic_vector(0 downto 0)

    blueout std_logic_vector(0 downto 0))

    end vga_controller

    architecture Behavioral of vga_controller is

        -- Image memory: 18-bit address, one bit per pixel.
        component memory
            port (
                addr  : in  std_logic_vector(17 downto 0);
                clk   : in  std_logic;
                dout  : out std_logic_vector(0 downto 0);
                en    : in  std_logic;
                sinit : in  std_logic
            );
        end component;

        component sync_unit
            port (
                clk        : in  std_logic;
                rstn       : in  std_logic;
                hsync      : out std_logic;  -- horizontal sync pulse (row)
                vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
                v_en_reg   : out std_logic;
                h_en       : out std_logic;
                v_en       : out std_logic;
                line_count : out integer;    -- mode "out", as in the sync_unit entity
                c_hs       : out integer     -- mode "out", as in the sync_unit entity
            );
        end component;

        component pixelgen
            port (
                clk        : in  std_logic;
                rstn       : in  std_logic;
                hsync      : in  std_logic;
                vsync      : in  std_logic;
                h_en       : in  std_logic;
                v_en       : in  std_logic;
                v_en_reg   : in  std_logic;
                data_in    : in  std_logic_vector(0 downto 0);
                line_count : in  integer;
                c_hs       : in  integer;
                red        : out std_logic_vector(0 downto 0);
                green      : out std_logic_vector(0 downto 0);
                blue       : out std_logic_vector(0 downto 0);
                addr       : out std_logic_vector(17 downto 0);
                rd         : out std_logic
            );
        end component;

        signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
        signal line_count_s, c_hs_s, addr : integer;
        signal data_s : std_logic_vector(0 downto 0);
        signal addr_s : std_logic_vector(17 downto 0);

    begin

        --addr_s <= std_logic_vector(to_unsigned(addr, 18));

        -- Divide the input clock by two to obtain the pixel clock.
        process (clk50m, rstn)
        begin
            if (rstn = '0') then
                clk <= '0';
            elsif (clk50m = '1' and clk50m'event) then
                clk <= not clk;
            end if;
        end process;

        hsync <= hsync_s;
        vsync <= vsync_s;

        sync : sync_unit port map (
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync_s,
            vsync      => vsync_s,
            v_en_reg   => v_en_reg_s,
            line_count => line_count_s,
            h_en       => h_en_s,
            v_en       => v_en_s,
            c_hs       => c_hs_s
        );

        pixel : pixelgen port map (
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync_s,
            vsync      => vsync_s,
            h_en       => h_en_s,
            v_en       => v_en_s,
            v_en_reg   => v_en_reg_s,
            data_in    => data_s,
            line_count => line_count_s,
            c_hs       => c_hs_s,
            red        => red,
            green      => green,
            blue       => blue,
            addr       => addr_s,
            rd         => rd_s
        );

        mem12 : memory port map (
            addr  => addr_s,
            clk   => clk,
            dout  => data_s,
            en    => rd_s,
            sinit => rstn
        );

    end Behavioral;

    SYNCHRONIZATION BLOCK CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity sync_unit is
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
        );
    end sync_unit;

    architecture arch_sync_unit of sync_unit is
        signal h_en_s, v_en_s       : std_logic;
        signal line_count_s, c_hs_s : integer;
    begin

        c_hs       <= c_hs_s;
        line_count <= line_count_s;
        h_en       <= h_en_s;
        v_en       <= v_en_s;

        -- hsync counter: counts clock periods 0..793 within a line
        process (clk, rstn)
        begin
            if (rstn = '0') then
                c_hs_s <= 0;
            else
                if (clk = '1' and clk'event) then
                    if (c_hs_s = 793) then
                        c_hs_s <= 0;
                    else
                        c_hs_s <= c_hs_s + 1;
                    end if;
                end if;
            end if;
        end process;

        -- vsync counter
        -- vertical line counter: advances on each rising edge of h_en_s, 0..515
        process (h_en_s, rstn)
        begin
            if (rstn = '0') then
                line_count_s <= 0;
            elsif (h_en_s = '1' and h_en_s'event) then
                line_count_s <= line_count_s + 1;
                if (line_count_s = 515) then
                    line_count_s <= 0;  -- the later assignment wins at the wrap
                end if;
            end if;
        end process;

        -- hsync pulse: low while the horizontal count is 0..95
        process (clk, rstn)
        begin
            if (rstn = '0') then
                hsync <= '1';
            elsif (clk = '1' and clk'event) then
                if (c_hs_s <= 95) then
                    hsync <= '0';
                else
                    hsync <= '1';
                end if;
            end if;
        end process;

        -- Horizontal Enable (row information will be sent in this period)
        process (clk, rstn)
        begin
            if (rstn = '0') then
                h_en_s <= '0';
            elsif (clk = '1' and clk'event) then
                if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                    h_en_s <= '1';
                else
                    h_en_s <= '0';
                end if;
            end if;
        end process;

        -- vsync pulse
        -- Vertical Enable (data is valid in this region)
        process (clk, rstn)
        begin
            if (rstn = '0') then
                v_en_s <= '0';
            elsif (clk = '1' and clk'event) then
                if ((line_count_s >= 35 and line_count_s <= 515)) then
                    v_en_s <= '1';
                else
                    v_en_s <= '0';
                end if;
            end if;
        end process;

        -- Vertical sync, front porch, back porch: vsync is low for lines 0 and 1
        process (clk, rstn)
        begin
            if (rstn = '0') then
                vsync <= '1';
            elsif (clk = '1' and clk'event) then
                if (line_count_s < 2) then
                    vsync <= '0';
                else
                    vsync <= '1';
                end if;
            end if;
        end process;

        -- v_en delayed by one clock period
        process (clk)
        begin
            if (clk = '1' and clk'event) then
                v_en_reg <= v_en_s;
            end if;
        end process;

    end arch_sync_unit;

    PIXEL GENERATOR

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity pixelgen is
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end pixelgen;

    architecture behv of pixelgen is
        signal addr_s : std_logic_vector(17 downto 0);
    begin

        addr <= addr_s;

        process (clk, rstn)
        begin
            if (rstn = '0') then
                rd     <= '0';
                red    <= "0";
                green  <= "0";
                blue   <= "0";
                addr_s <= x"0000" & "00";
            elsif (clk = '1' and clk'event) then
                -- v_en high while its delayed copy is still low: start of the
                -- visible frame, so restart reading at address 0
                if ((v_en = '1') and (v_en_reg = '0')) then
                    addr_s <= (others => '0');
                end if;
                if (v_en = '1' and h_en = '1') then
                    if (line_count <= 514) then
                        if (c_hs <= 655) then
                            -- image area: read the next pixel from memory
                            rd     <= '1';
                            red    <= data_in;
                            green  <= data_in;
                            blue   <= data_in;
                            addr_s <= addr_s + "000000000000000001";
                        else
                            -- to the right of the image: green
                            rd    <= '0';
                            red   <= "0";
                            green <= "1";
                            blue  <= "0";
                        end if;
                    else
                        -- below the image: white
                        rd    <= '0';
                        red   <= "1";
                        green <= "1";
                        blue  <= "1";
                    end if;
                    if (addr_s = "111011111111111111") then
                        addr_s <= (others => '0');
                    end if;
                else
                    -- outside the enabled region: black
                    red   <= "0";
                    green <= "0";
                    blue  <= "0";
                end if;
            end if;
        end process;

    end behv;

    12 FPGA ARCHITECTURE

    A field-programmable gate array (FPGA) is a semiconductor device that can
    be configured by the customer or designer after manufacturing, hence the
    name "field-programmable". FPGAs are programmed using a logic circuit
    diagram or source code in a hardware description language (HDL) to specify
    how the chip will work. They can be used to implement any logical function
    that an application-specific integrated circuit (ASIC) could perform, but
    the ability to update the functionality after shipping offers advantages
    for many applications.

    FPGAs contain programmable logic components called "logic blocks" and a
    hierarchy of reconfigurable interconnects that allow the blocks to be wired
    together, somewhat like a one-chip programmable breadboard. Logic blocks
    can be configured to perform complex combinational functions or merely
    simple logic gates like AND and XOR. In most FPGAs the logic blocks also
    include memory elements, which may be simple flip-flops or more complete
    blocks of memory.

    History

    The FPGA industry sprouted from programmable read-only memory (PROM) and
    programmable logic devices (PLDs). PROMs and PLDs both had the option of
    being programmed in batches in a factory or in the field (field
    programmable); however, programmable logic was hard-wired between logic
    gates.

    Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
    first commercially viable field-programmable gate array in 1985: the
    XC2064. The XC2064 had programmable gates and programmable interconnects
    between gates, the beginnings of a new technology and market. The XC2064
    boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
    tables (LUTs). More than 20 years later, Freeman was entered into the
    National Inventors Hall of Fame for his invention.

    Some of the industry's foundational concepts and technologies for
    programmable logic arrays, gates, and logic blocks are founded in patents
    awarded to David W. Page and LuVerne R. Peterson in 1985.

    In the late 1980s the Naval Surface Warfare Department funded an experiment
    proposed by Steve Casselman to develop a computer that would implement
    600,000 reprogrammable gates. Casselman was successful, and the system was
    awarded a patent in 1992.

    Xilinx continued unchallenged and quickly growing from 1985 to the
    mid-1990s, when competitors sprouted up, eroding significant market share.
    By 1993, Actel was serving about 18 percent of the market.

    The 1990s were an explosive period of time for FPGAs, both in
    sophistication and in the volume of production. In the early 1990s, FPGAs
    were primarily used in telecommunications and networking. By the end of the
    decade, FPGAs had found their way into consumer, automotive, and industrial
    applications.

    FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
    algorithm technology and FPGAs to create a sound recognition device.
    Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA
    chip to decide the configuration needed to accomplish a sound recognition
    task.

    12.1 Modern developments

    A recent trend has been to take the coarse-grained architectural approach a
    step further by combining the logic blocks and interconnects of traditional
    FPGAs with embedded microprocessors and related peripherals to form a
    complete "system on a programmable chip". This work mirrors the
    architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
    Group, which combined a reconfigurable CPU architecture on a single chip
    called the SB24. That work was done in 1982. Examples of such hybrid
    technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
    which include one or more PowerPC processors embedded within the FPGA's
    logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
    processor in combination with Atmel's programmable logic architecture.

    An alternative to using hard-macro processors is to make use of "soft"
    processor cores that are implemented within the FPGA logic.

    Many modern FPGAs can be reprogrammed at run time, and this is leading to
    the idea of reconfigurable computing or reconfigurable systems: CPUs that
    reconfigure themselves to suit the task at hand. The Mitrion Virtual
    Processor from Mitrionics is an example of a reconfigurable soft processor
    implemented on FPGAs. However, it does not support dynamic reconfiguration
    at run time, but instead adapts itself to a specific program.

    Additionally, new non-FPGA architectures are beginning to emerge.
    Software-configurable microprocessors such as the Stretch S5000 adopt a
    hybrid approach by providing an array of processor cores and FPGA-like
    programmable cores on the same chip.

    Gates

    Year           Gates
    1987           9,000 (Xilinx)
    1992           600,000 (Naval Surface Warfare Department)
    Early 2000s    Millions

    Market size

    Year             Market size
    1985             First commercial FPGA technology invented by Xilinx
    1987             $14 million
    ~1993            >$385 million
    2005             $1.9 billion
    2010 estimate    $2.75 billion

    12.2 FPGA Comparisons

    Historically, FPGAs have been slower, less energy efficient and generally
    achieved less functionality than their fixed ASIC counterparts. A
    combination of volume, fabrication improvements, research and development,
    and the I/O capabilities of new supercomputers has largely closed the
    performance gap between ASICs and FPGAs.

    Advantages include a shorter time to market, the ability to re-program in
    the field to fix bugs, and lower non-recurring engineering costs. Vendors
    can also take a middle road by developing their hardware on ordinary FPGAs
    but manufacturing their final version so that it can no longer be modified
    after the design has been committed.

    Xilinx claims that several market and technology dynamics are changing the
    ASIC/FPGA paradigm:

    IC costs are rising aggressively
    ASIC complexity has bolstered development time and costs
    R&D resources and headcount are decreasing
    Revenue losses for slow time-to-market are increasing
    Financial constraints in a poor economy are driving low-cost technologies

    These trends make FPGAs a better alternative than ASICs for a growing
    number of higher-volume applications than they have historically been used
    for, to which the company attributes the growing number of FPGA design
    starts (see History).

    The primary differences between CPLDs and FPGAs are architectural. A CPLD
    has a somewhat restrictive structure consisting of one or more programmable
    sum-of-products logic arrays feeding a relatively small number of clocked
    registers. The result is less flexibility, with the advantage of more
    predictable timing delays and a higher logic-to-interconnect ratio. FPGA
    architectures, on the other hand, are dominated by interconnect. This makes
    them far more flexible (in terms of the range of designs that are practical
    for implementation within them) but also far more complex to design for.

    Another notable difference between CPLDs and FPGAs is the presence in most
    FPGAs of higher-level embedded functions (such as adders and multipliers)
    and embedded memories, as well as logic blocks that implement decoders or
    mathematical functions.

    Some FPGAs have the capability of partial re-configuration, which lets one
    portion of the device be re-programmed while other portions continue
    running.

    12.3 Applications

    Applications of FPGAs include digital signal processing, software-defined
    radio, aerospace and defense systems, ASIC prototyping, medical imaging,
    computer vision, speech recognition, cryptography, bioinformatics, computer
    hardware emulation, radio astronomy, and a growing range of other areas.

    FPGAs originally began as competitors to CPLDs and competed in a similar
    space, that of glue logic for PCBs. As their size, capabilities, and speed
    increased, they began to take over larger and larger functions, to the
    point where some are now marketed as full systems on chips (SoC).
    Particularly with the introduction of dedicated multipliers into FPGA
    architectures in the late 1990s, applications which had traditionally been
    the sole reserve of DSPs began to incorporate FPGAs instead.

    FPGAs especially find applications in any area or algorithm that can make
    use of the massive parallelism offered by their architecture. One such area
    is code breaking, in particular brute-force attacks on cryptographic
    algorithms.

    FPGAs are increasingly used in conventional high performance computing
    applications where computational kernels such as FFT or convolution are
    performed on the FPGA instead of a microprocessor.

    The inherent parallelism of the logic resources on an FPGA allows for
    considerable computational throughput even at low clock rates. The
    flexibility of the FPGA allows for even higher performance by trading off
    precision and range in the number format for an increased number of
    parallel arithmetic units. This has driven a new type of processing called
    reconfigurable computing, where time-intensive tasks are offloaded from
    software to FPGAs.

    The adoption of FPGAs in high performance computing is currently limited by
    the complexity of FPGA design compared to conventional software and by the
    extremely long turn-around times of current design tools, where a wait of
    4-8 hours is necessary after even minor changes to the source code.

    Traditionally, FPGAs have been reserved for specific vertical applications
    where the volume of production is small. For these low-volume applications,
    the premium that companies pay in hardware costs per unit for a
    programmable chip is more affordable than the development resources spent
    on creating an ASIC. Today, new cost and performance dynamics have
    broadened the range of viable applications.

    Architecture

    The most common FPGA architecture consists of an array of configurable
    logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
    routing channels have the same width (number of wires). Multiple I/O pads
    may fit into the height of one row or the width of one column in the array.

    An application circuit must be mapped into an FPGA with adequate resources.
    While the number of CLBs and I/Os required is easily determined from the
    design, the number of routing tracks needed may vary considerably even
    among designs with the same amount of logic. (For example, a crossbar
    switch requires much more routing than a systolic array with the same gate
    count.) Since unused routing tracks increase the cost (and decrease the
    performance) of the part without providing any benefit, FPGA manufacturers
    try to provide just enough tracks so that most designs that will fit in
    terms of LUTs and I/Os can be routed. This is determined by estimates such
    as those derived from Rent's rule or by experiments with existing designs.

    The FPGA described here is an array, or island-style, FPGA. It consists of
    an array of logic blocks and routing channels. Two I/O pads fit into the
    height of one row or the width of one column, as shown below. All the
    routing channels have the same width (number of wires).

    12.1 FPGA structure

    A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
    flip-flop, as shown below. In recent years, manufacturers have started
    moving to 6-input LUTs in their high performance parts, claiming increased
    performance.

    Fig. Typical logic block

    There is only one output, which can be either the registered or the
    unregistered LUT output. The logic block has four inputs for the LUT and a
    clock input. Since clock signals (and often other high-fanout signals) are
    normally routed via special-purpose dedicated routing networks in
    commercial FPGAs, they are managed separately from other signals.
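The logic block just described (a 4-input LUT followed by a flip-flop, with a choice between the registered and unregistered output) can be modelled behaviourally as a rough sketch. The entity name `logic_block`, the generics `INIT` and `REGISTERED`, and the port names are assumptions made for illustration; real devices configure these through their bitstream rather than through generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
    generic (
        -- Truth table of the LUT: bit i is the output for input value i.
        -- x"8000" sets only bit 15, i.e. a 4-input AND.
        INIT       : std_logic_vector(15 downto 0) := x"8000";
        -- true: drive the output from the flip-flop, false: from the LUT
        REGISTERED : boolean := true
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
        y   : out std_logic                      -- the single block output
    );
end entity logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16-entry table indexed by the input vector.
    lut_out <= INIT(to_integer(unsigned(a)));

    -- The flip-flop stores the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Output multiplexer: registered or unregistered LUT output.
    y <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing `INIT` changes the function the block computes without changing its structure, which is the essence of LUT-based programmable logic.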

    For this example architecture, the locations of the FPGA logic block pins
    are shown below.

    Fig. Logic block pin locations

    Each input is accessible from one side of the logic block, while the output
    pin can connect to routing wires in both the channel to the right of and
    the channel below the logic block.

    Each logic block output pin can connect to any of the wiring segments in
    the channels adjacent to it.

    Similarly, an I/O pad can connect to any one of the wiring segments in the
    channel adjacent to it. For example, an I/O pad at the top of the chip can
    connect to any of the W wires (where W is the channel width) in the
    horizontal channel immediately below it.

    Generally, the FPGA routing is unsegmented. That is, each wiring segment
    spans only one logic block before it terminates in a switch box. By turning
    on some of the programmable switches within a switch box, longer paths can
    be constructed. For higher speed interconnect, some FPGA architectures use
    longer routing lines that span multiple logic blocks.

    Whenever a vertical and a horizontal channel intersect, there is a switch
    box. In this architecture, when a wire enters a switch box, there are three
    programmable switches that allow it to connect to three other wires in
    adjacent channel segments. The pattern, or topology, of switches used in
    this architecture is the planar or domain-based switch box topology. In
    this switch box topology, a wire in track number one connects only to wires
    in track number one in adjacent channel segments, wires in track number 2
    connect only to other wires in track number 2, and so on. The figure below
    illustrates the connections in a switch box.

    Fig. Switch box topology

    Modern FPGA families expand upon the above capabilities to include higher
    level functionality fixed into the silicon. Having these common functions
    embedded into the silicon reduces the area required and gives those
    functions increased speed compared to building them from primitives.
    Examples of these include multipliers, generic DSP blocks, embedded
    processors, high speed I/O logic, and embedded memories.

    FPGAs are also widely used for systems validation, including pre-silicon
    validation, post-silicon validation, and firmware development. This allows
    chip companies to validate their design before the chip is produced in the
    factory, reducing the time to market.

    12.4 FPGA Design and Programming

    To define the behavior of the FPGA, the user provides a hardware
    description language (HDL) or a schematic design. The HDL form might be
    easier to work with when handling large structures, because it is possible
    to specify them numerically rather than having to draw every piece by hand.
    On the other hand, schematic entry can allow for easier visualisation of a
    design.

    Then, using an electronic design automation tool, a technology-mapped
    netlist is generated. The netlist can then be fitted to the actual FPGA
    architecture using a process called place-and-route, usually performed by
    the FPGA company's proprietary place-and-route software. The user will
    validate the map, place and route results via timing analysis, simulation,
    and other verification methodologies. Once the design and validation
    process is complete, the binary file generated (also using the FPGA
    company's proprietary software) is used to (re)configure the FPGA.

    Going from schematic/HDL source files to actual configuration: the source
    files are fed to a software suite from the FPGA/CPLD vendor that, through
    different steps, will produce a file. This file is then transferred to the
    FPGA/CPLD via a serial interface (JTAG) or to an external memory device
    like an EEPROM.

    The most common HDLs are VHDL and Verilog, although in an attempt to reduce
    the complexity of designing in HDLs, which have been compared to the
    equivalent of assembly languages, there are moves to raise the abstraction
    level through the introduction of alternative languages.

    To simplify the design of complex systems in FPGAs, there exist libraries
    of predefined complex functions and circuits that have been tested and
    optimized to speed up the design process. These predefined circuits are
    commonly called IP cores, and are available from FPGA vendors and
    third-party IP suppliers (rarely free, and typically released under
    proprietary licenses). Other predefined circuits are available from
    developer communities such as OpenCores (typically free, and released under
    the GPL, BSD or a similar license) and other sources.

    In a typical design flow, an FPGA application developer will simulate the
    design at multiple stages throughout the design process. Initially the RTL
    description in VHDL or Verilog is simulated by creating test benches to
    simulate the system and observe results. Then, after the synthesis engine
    has mapped the design to a netlist, the netlist is translated to a gate
    level description, where simulation is repeated to confirm that the
    synthesis proceeded without errors. Finally the design is laid out in the
    FPGA, at which point propagation delays can be added and the simulation run
    again with these values back-annotated onto the netlist.
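As a small illustration of the RTL simulation step, the sketch below pairs a divide-by-two clock divider (the same idea as the clock process in the top-level listing) with a self-checking test bench. The names `clk_div2` and `clk_div2_tb` and the 20 ns clock period are assumptions chosen for the example, not part of the original design files.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: toggles its output on every rising input edge.
entity clk_div2 is
    port (
        clk_in  : in  std_logic;
        rstn    : in  std_logic;  -- asynchronous reset, active low
        clk_out : out std_logic
    );
end entity clk_div2;

architecture rtl of clk_div2 is
    signal q : std_logic := '0';
begin
    process (clk_in, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk_in) then
            q <= not q;
        end if;
    end process;
    clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports, generates stimulus and checks the response.
entity clk_div2_tb is
end entity clk_div2_tb;

architecture sim of clk_div2_tb is
    signal clk_in  : std_logic := '0';
    signal rstn    : std_logic := '0';
    signal clk_out : std_logic;
    signal done    : boolean := false;
begin
    dut : entity work.clk_div2
        port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

    -- 20 ns period stimulus clock; it stops toggling once the test is done.
    clk_in <= not clk_in after 10 ns when not done else '0';

    stimulus : process
    begin
        wait for 25 ns;  -- keep reset active across the first rising edge
        assert clk_out = '0'
            report "output should stay low during reset" severity error;
        rstn <= '1';
        wait until rising_edge(clk_in);
        wait for 1 ns;
        assert clk_out = '1'
            report "output should toggle high on the first edge after reset"
            severity error;
        wait until rising_edge(clk_in);
        wait for 1 ns;
        assert clk_out = '0'
            report "output should toggle low on the second edge after reset"
            severity error;
        done <= true;
        wait;  -- suspend forever so the simulation can end
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be re-run against the post-synthesis and post-layout netlists, which is the repetition of simulation that the paragraph above describes.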

    Basic Process Technology Types

    SRAM: based on static memory technology. In-system programmable and
    re-programmable. Requires external boot devices. CMOS.

    Antifuse: one-time programmable. CMOS.

    EPROM: Erasable Programmable Read-Only Memory technology. Usually one-time
    programmable in production because of plastic packaging. Windowed devices
    can be erased with ultraviolet (UV) light. CMOS.

    EEPROM: Electrically Erasable Programmable Read-Only Memory technology. Can
    be erased, even in plastic packages. Some, but not all, EEPROM devices can
    be in-system programmed. CMOS.

    Flash: flash-erase EPROM technology. Can be erased, even in plastic
    packages. Some, but not all, flash devices can be in-system programmed.
    Usually, a flash cell is smaller than an equivalent EEPROM cell and is
    therefore less expensive to manufacture. CMOS.

    Fuse: one-time programmable. Bipolar.

    Major Manufacturers

    Xilinx and Altera are the current FPGA market leaders and long-time
    industry rivals. Together, they control over 80 percent of the market, with
    Xilinx alone representing over 50 percent.

    Xilinx also provides free Windows and Linux design software, while Altera
    provides free Windows tools; the Solaris and Linux tools are only available
    via a rental scheme.

    Other competitors include Lattice Semiconductor (flash, SRAM), Actel
    (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
    power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic
    (handheld-focused CSSP, no general purpose FPGAs).

    12.5 FPGA prototype

    FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
    prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
    for hardware verification and early software development.

    Verifying the hardware design alongside early software and firmware
    co-design has become mainstream, and prototyping SoC and ASIC designs on an
    FPGA has become a good method of doing this.

    Reasons why prototyping is important:

    1. Running a SoC design on an FPGA prototype is a reliable way to ensure
    that it is functionally correct, compared with designers relying only on
    software simulations to verify that their hardware design is sound.
    Simulation speed and modeling accuracy limitations hinder that kind of
    development.

    2. Due to time constraints, many projects cannot wait until the silicon is
    back from the foundry to start on software tests. FPGA prototyping allows
    much more time for software development and testing at the
    software-hardware integration stage. This allows many unforeseen software
    bugs, which appear due to today's array of operating systems, applications,
    and hardware, to be found early.

    3. Prototyping also allows the developer to ensure that all IP technologies
    on the system work well together, off the simulation stage and in actual
    form.

    4. Prototypes have the added advantage of serving as demo platforms for SoC
    clients, bringing in interest early. This speeds up the overall development
    cycle and allows for more enhancement or improvement of the chip features
    than would otherwise have been possible.

    13 SIMULATION RESULTS

    GATE LEVEL
    Fig. VGA CONTROLLER

    GATE LEVEL
    Fig. SYNC UNIT

    TECHNOLOGY SCHEMATIC

    14 APPLICATIONS

    Motion tracking systems
    Image processing systems
    Display systems

    14.1 CONCLUSION

    Hence, with the VGA Controller implemented on an FPGA, a
    processor-generated picture (pixelated) is taken from memory space and the
    Controller provides digital RGB values for each pixel, as well as
    horizontal and vertical synchronization signals, in order to correctly
    display the picture on a connected monitor.

    REFERENCES

    1. http://en.wikipedia.org/wiki/Video_Graphics_Array
    2. Eduardo Sanchez, "A VGA display controller"
    3. Hinner, "VGA Interface and video signal documents"


      2.2 The VGA specifications are as follows:

      256 KB Video RAM (the very first cards could be ordered with 64 KB or
      128 KB of RAM, at the cost of losing some video modes)
      16-color and 256-color modes
      262,144-value color palette (six bits each for red, green, and blue)
      Selectable 25.175 MHz or 28.322 MHz master clock
      Maximum of 800 horizontal pixels
      Maximum of 600 lines
      Refresh rates at up to 70 Hz
      Vertical blank interrupt
      Packed-pixel mode: 256 colors
      Hardware smooth scrolling support

      The VGA supports both All Points Addressable graphics modes and
      alphanumeric text modes. Standard graphics modes are:

      640×480 in 16 colors
      640×350 in 16 colors
      320×200 in 16 colors
      320×200 in 256 colors

      3 DESCRIPTION ARCHITECTURE

      3.1 VGA display device

      The VGA display device is used for displaying the images taken from the
      processor in an exact format. It can be used for displaying the output
      of any project. Analog graphs etc. can be displayed directly on the VGA
      display monitor.

      3.1 VGA (Video Graphics Array) Controller

      The VGA controller controls the image so that it is displayed exactly on
      the display device. It acts as an interface between the processing
      system and the display device, and is therefore called an interface
      controller.

      Image or video processing system -> VGA Controller -> Display

      4 CONTROLLER DESIGN

      To design the controller, we should first know how the display device
      works. Consider the image below.

      Fig. Original image (640x480)

      Fig. Pixels of image

      The display device also has pixels. The display resolution (640x480) is
      a standard display resolution.

      Pixel

      This example shows an image with a portion greatly enlarged, in which
      the individual pixels are rendered as little squares and can easily be
      seen.

      4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

      In digital imaging, a pixel (or picture element [1]) is a single point
      in a raster image. The pixel is the smallest addressable screen element;
      it is the smallest unit of picture which can be controlled. Each pixel
      has its own address. The address of a pixel corresponds to its
      coordinate. Pixels are normally arranged in a 2-dimensional grid, and
      are often represented using dots or squares. Each pixel is a sample of
      an original image, where more samples typically provide more-accurate
      representations of the original. The intensity of each pixel is
      variable. In color image systems, a color is typically represented by
      three or four component intensities such as red, green, and blue, or
      cyan, magenta, yellow, and black.
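One common way to turn a pixel's coordinate into a memory address is row-major ordering, sketched below. This is an assumption made for illustration: the text only states that the address corresponds to the coordinate. The package name, the function name and the 640-pixel line width are hypothetical.

```vhdl
package pixel_addr_sketch is
    constant LINE_WIDTH : natural := 640;  -- pixels per line (assumed)

    -- Row-major address of the pixel in column x of row y.
    function pixel_address (x : natural; y : natural) return natural;
end package pixel_addr_sketch;

package body pixel_addr_sketch is
    function pixel_address (x : natural; y : natural) return natural is
    begin
        -- All pixels of the rows above come first, then x pixels of this row.
        return y * LINE_WIDTH + x;
    end function pixel_address;
end package body pixel_addr_sketch;
```

Under these assumptions the first pixel of the second row, `pixel_address(0, 1)`, has address 640.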

      Color refreshing rate

      Color refreshing rate = 60 Hz

      The color refreshing rate is the number of frames that are transmitted
      per second. The minimum number of frames that can be transmitted per
      second by the VGA controller is 60.
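As a rough cross-check of the 60 Hz figure, the frame rate is the clock frequency divided by the number of clock periods per frame. The sketch below plugs in the counter limits used in the synchronization block listing (a horizontal counter running from 0 to 793 and a line counter running from 0 to 515) and assumes a 25 MHz clock, obtained by halving an input that the port name clk50m suggests is 50 MHz. The package and constant names are made up for illustration.

```vhdl
package refresh_rate_sketch is
    constant PIXEL_CLK_HZ    : natural := 25_000_000;  -- clk50m / 2 (assumed 50 MHz input)
    constant CLKS_PER_LINE   : natural := 794;         -- horizontal counter 0..793
    constant LINES_PER_FRAME : natural := 516;         -- line counter 0..515

    -- 794 * 516 = 409_704 clock periods per frame
    constant CLKS_PER_FRAME  : natural := CLKS_PER_LINE * LINES_PER_FRAME;

    -- Integer division truncates: 25_000_000 / 409_704 gives 61
    constant FRAME_RATE_HZ   : natural := PIXEL_CLK_HZ / CLKS_PER_FRAME;
end package refresh_rate_sketch;
```

With these numbers the frame rate comes out at roughly 61 frames per second, which is consistent with the stated minimum of 60.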

      5 Hardware Description

      5.1 Block Diagram

      5.1 VGA Synchronization Unit

      The Synchronization Unit provides the horizontal and vertical
      synchronization signals (HSYNC and VSYNC) that are required to correctly
      display a picture frame within the confines of a monitor's display area.

      These synchronization signals are used as control inputs by the
      monitor's horizontal and vertical deflection circuits. These circuits
      deflect the electrons emitted by the three primary color electron guns
      (Red, Green, Blue) from left to right and from top to bottom,
      respectively. HSYNC provides the start and stop times for the horizontal
      deflection circuit, so that a line of pixels is correctly drawn across
      the screen display. VSYNC provides the start and stop times for the
      vertical deflection circuit, so that the lines of a frame are correctly
      drawn from the top to the bottom of the screen display.

      The resolution for the display is defined by the level on the RESOLUTION
      input. If High, the 640x480 resolution (VGA) is used. If Low, the
      800x600 resolution (SVGA) is used.

      Although the resolution determines the area of a monitor's screen within
      which an image can be displayed, the full extents of the chosen
      resolution do not have to be used. The actual extents of the image
      display area on the screen can be controlled by the use of the
      DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total
      number of pixels to be used in a line and the total number of lines in a
      frame, respectively.

      5.2 Blank pixel generation

      The total number of viewable pixels in a line and viewable lines in a

      frame is determined by the display resolution chosen through the

      RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on

      the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the

      chosen display resolution is used or not, the areas of the monitor screen to the

      top, bottom, left, and right of the viewable frame area are blanked by putting

      black pixels at the required line-pixel positions. This has the effect of centering

      the image on the screen.

      The color generated for a pixel in the Pixel Unit depends on whether

      the particular pixel needs to be blanked or not. The Synchronization Unit

      provides a signal to the Pixel Unit for this very reason. This is the line display

      enable signal, en. The signal is checked on each rising edge of the external

      clock signal (CLK) and is set as follows:

      If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
         en = 0 (pixel needs to be blanked - set color to black)
      Else
         en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
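The comparison above can be sketched in a few lines of Python. This is only a behavioural model under stated assumptions, not the Controller's HDL: the function name `display_enable` is hypothetical, and HCOUNT and VCOUNT are assumed to count from zero.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Model of the line display enable signal `en` (hypothetical helper).

    Returns 1 for a viewable pixel and 0 for a pixel that must be blanked.
    Counters are assumed to start at zero.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: color is forced to black
    return 1      # inside the viewable area: RGB color is generated


# Full 640x480 frame: (639, 479) is the last viewable pixel.
print(display_enable(639, 479, 640, 480))  # 1
print(display_enable(640, 479, 640, 480))  # 0, past the end of the line
print(display_enable(0, 480, 640, 480))    # 0, past the last viewable line
```

Passing smaller values for `pix_total` and `lin_total` models a reduced display area selected through DISPSIZE_H and DISPSIZE_V.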

      6 SIGNALS IN VGA CONTROLLER

      For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal

      timings are:

      Parameter                  Value      Unit
      Clock frequency            25.175     MHz
      Horizontal pixels          640
      Horizontal sync polarity   Negative
      Total time for each line   31.77      µs
      Front porch (A)            0.94       µs
      Sync pulse length (B)      3.77       µs
      Back porch (C)             1.89       µs
      Active video (D)           25.17      µs

      (Total horizontal sync time: 6.60 µs)

      6.1 The vertical timings are

      Parameter                  Value      Unit
      Vertical lines             480
      Vertical sync polarity     Negative
      Vertical frequency         59.94      Hz
      Front porch (E)            0.35       ms
      Sync pulse length (F)      0.06       ms
      Back porch (G)             1.02       ms
      Active video (H)           15.25      ms

      (Total vertical sync time: 1.43 ms)

      640 x 400 at 70 Hz is the video mode used for booting most x86 personal

      computers.

      640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

      It should be noted that the actual timings vary slightly. For example, for

      640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of

      25.174 MHz gives 633 pixels rather than the expected 640 pixels.
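The mismatch noted above follows directly from multiplying the active video time by the pixel frequency. The short Python check below reproduces the arithmetic; the helper names are hypothetical and the two constants are the rounded figures quoted in the text.

```python
PIXEL_CLOCK_HZ = 25.174e6   # pixel frequency quoted in the text
ACTIVE_VIDEO_S = 25.17e-6   # active video time quoted in the text


def pixels_in(duration_s: float, clock_hz: float) -> float:
    """Number of pixel clocks that fit into a time interval."""
    return duration_s * clock_hz


def time_for(pixels: int, clock_hz: float) -> float:
    """Time in seconds needed to send a number of pixels."""
    return pixels / clock_hz


active = pixels_in(ACTIVE_VIDEO_S, PIXEL_CLOCK_HZ)
print(f"{active:.2f} pixel clocks fit into 25.17 us")                 # 633.63
print(f"{time_for(640, PIXEL_CLOCK_HZ) * 1e6:.2f} us for 640 pixels")  # 25.42
```

The rounded table values therefore account for about 633 whole pixels, while 640 pixels need roughly 25.42 µs at this clock.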

      6.2 Generation of the horizontal synchronization signal - HSYNC

      The HSYNC signal is High (inactive) after an external reset signal

      (RST) is received by the VGA Controller. The signal is updated on each rising

      edge of the external clock signal (CLK).

      The state of the HSYNC signal depends on the value stored in the horizontal

      counter and is driven low when:

      HCOUNT >= (PixTotal + BlankingLeft)

      and remains low while:

      HCOUNT < (PixTotal + BlankingLeft + TLSync)

      Vertical (Frame) Period

      6.3 Generation of the vertical synchronization signal - VSYNC

      The VSYNC signal is High (inactive) after an external reset signal

      (RST) is received by the VGA Controller. The signal is updated after every

      line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

      The state of the VSYNC signal depends on the value stored in the vertical

      counter and is driven low when:

      VCOUNT >= (LinTotal + BlankingLeft)

      and remains low while:

      VCOUNT < (LinTotal + BlankingLeft + TFSync)
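The HSYNC and VSYNC conditions share one shape: the output is low inside a window that starts after the viewable count plus a blanking count and lasts for the sync length. The Python sketch below models that window. The function name is hypothetical, and the numbers (16 and 96 pixel clocks in an 800-clock line, 10 and 2 lines in a 525-line frame) are the commonly quoted 640x480 at 60 Hz counts, used here as assumptions rather than values taken from this report.

```python
def sync_level(count: int, visible: int, blanking: int, sync_len: int) -> int:
    """Return 0 while the sync pulse is active (low), otherwise 1."""
    start = visible + blanking
    return 0 if start <= count < start + sync_len else 1


# Assumed 640x480 @ 60 Hz counts: pixel clocks for H, lines for V.
H = {"visible": 640, "blanking": 16, "sync_len": 96, "period": 800}
V = {"visible": 480, "blanking": 10, "sync_len": 2, "period": 525}

hsync_low = [h for h in range(H["period"])
             if sync_level(h, H["visible"], H["blanking"], H["sync_len"]) == 0]
vsync_low = [v for v in range(V["period"])
             if sync_level(v, V["visible"], V["blanking"], V["sync_len"]) == 0]

print(hsync_low[0], hsync_low[-1], len(hsync_low))  # 656 751 96
print(vsync_low)                                    # [490, 491]
```

In this model `visible` plays the role of PixTotal or LinTotal, `blanking` the role of BlankingLeft, and `sync_len` the role of TLSync or TFSync.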

      Address Counter (MEMADDR)

      The address counter is used to store the position of the next consecutive

      pixel in the frame. Its value is passed to the Pixel Unit on the internal bus

      signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain

      the next pixel from picture memory.

      The counter is reset to zero when the VGA Controller receives an external

      reset signal (RST). The size of the counter depends on the values chosen for

      DISPSIZE_H and DISPSIZE_V, as the range is simply:

      0 to (PixTotal x LinTotal) - 1

      Taking the maximum number of pixels in a line and lines in a frame for each

      of the supported display resolutions, the maximum ranges for the counter are:

      • 640x480: 0 to 307199

      • 800x600: 0 to 479999

      The counter has 19-bit resolution.

      While the value in the horizontal counter (HCOUNT) is less than the

      total number of viewable pixels in a line (PixTotal, the integer value of

      DISPSIZE_H), the counter is incremented on the rising edge of the external

      clock signal (CLK). Pixel addressing within the frame is consecutive. When

      the counter reaches the last pixel in a line, its incremented value is the first

      pixel in the next line down.

      The address counter will continue to be incremented until the value in the

      vertical counter (VCOUNT) is greater than or equal to the total number of

      viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this

      point it will be rolled over to zero.
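The 19-bit counter width can be checked from the two maximum ranges. A minimal Python sketch, with a hypothetical helper name:

```python
def address_range(pix_total: int, lin_total: int) -> tuple[int, int]:
    """Return (last address, bits needed) for a frame of the given size."""
    last_address = pix_total * lin_total - 1
    return last_address, last_address.bit_length()


for width, height in ((640, 480), (800, 600)):
    last, bits = address_range(width, height)
    print(f"{width}x{height}: 0 to {last}, {bits} bits")
```

Both resolutions need 19 bits, since 2^18 = 262144 is too small for either range and 2^19 = 524288 covers both.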

      7 VGA Controller

      The VGA Controller provides a simple 8-bit interface between a host

      microcontroller and any VGA-compatible monitor. This variant of the

      Controller provides six modes of display, depending on the resolution chosen

      (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &

      White, 16 Colors, or 64 Colors).

      7.1 VGA Controller Tasks

      1) To display the frames at a minimum 60 Hz refresh rate:

      1 sec = 60 frames

      1 frame = max 1/60 sec

      1 pixel = (1/60 sec) / 307200 ≈ 54 ns, i.e. roughly 50 ns

      Blanking intervals also take up part of each frame period, so the

      maximum time for 1 pixel is taken as 40 ns.

      2) To send information for each and every pixel of a frame by means of

      the RGB signals.

      3) After one row has been completely displayed, the column must be

      reset and the row incremented. This is done by the Hsync signal.

      4) After one frame has been completed, a new frame has to be started,

      for which both row and column are reset. This is done by the Vsync signal.

      => To give the information for 1 pixel we use 1 clock (for controlling all

      operations):

      1 clock = 1 pixel

      1 clock cycle period = 40 ns

      Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

      So if we use a 25 MHz clock and display the information for 1 pixel on each and every

      clock, we can attain the minimum 60 Hz refresh rate.

      Do not use a clock frequency of less than 25 MHz.

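      Ex:

      Clock frequency   Time period   No. of frames/sec
      15 MHz            More          Less (unable to display 60 frames/sec)
      25 MHz            40 ns         60 (effectively utilized frequency)
      30 MHz            Less          More

      As the clock frequency decreases, the time period increases and the

      number of frames/sec decreases.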
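The effect of the clock frequency on the refresh rate can be checked numerically. The sketch below assumes 800 clocks per line and 525 lines per frame, the totals commonly quoted for 640x480 including blanking; these totals and the function name are assumptions, not values stated in this report.

```python
def refresh_rate_hz(clock_hz: float,
                    clocks_per_line: int = 800,    # assumed total, incl. blanking
                    lines_per_frame: int = 525) -> float:  # assumed total
    """Frames per second when one pixel slot is sent on every clock."""
    return clock_hz / (clocks_per_line * lines_per_frame)


for mhz in (15, 25, 30):
    period_ns = 1e3 / mhz
    rate = refresh_rate_hz(mhz * 1e6)
    print(f"{mhz} MHz: {period_ns:.1f} ns per pixel, {rate:.1f} frames/sec")
```

Under these assumptions a 15 MHz clock gives about 36 frames/sec, 25 MHz gives close to 60, and 30 MHz gives about 71.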

      Signals used for image display:

      R (Red)

      G (Green)

      B (Blue)

      HSYNC (Horizontal synchronization signal)

      VSYNC (Vertical synchronization signal)

      All of the above signals are used to control the image data so that it is correctly

      displayed on the VGA display monitor.

      7.2 Timings for front porch, sync pulse, and back porch

                     Horizontal sync   Vertical sync
      Front porch    0.94 µs           0.35 ms
      Sync pulse     3.77 µs           0.063 ms
      Back porch     1.89 µs           1.02 ms

      8 FUNCTIONAL DESCRIPTION

      8.1 Symbol

      VGA Controller Symbol

      Pin description

      The pin description is shown in the table below.

      Table: VGA pin description

      Global Control Signals

      CLK          Type: I   Polarity: Rising     Bus size: -
                   Global system clock. This clock determines the maximum rate at
                   which pixels can be sent to the monitor. The frequency of the
                   clock signal determines the refresh rate as follows:
                   640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
                   800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

      RST          Type: I   Polarity: High       Bus size: -
                   Global system reset.

      VGA Input Settings Signals

      RESOLUTION   Type: I   Polarity: High/Low   Bus size: -
                   This input selects the screen resolution to be used:
                   1 = 640x480 (VGA), 0 = 800x600 (SVGA).

      CMOD         Type: I   Polarity: -          Bus size: 2
                   This input selects the color palette to be used:
                   00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

      DISPSIZE_H   Type: I   Polarity: -          Bus size: 10
                   This input determines the number of viewable pixels to be
                   displayed in each line of a frame and is therefore used to
                   control the horizontal extents of the visible display area.

      DISPSIZE_V   Type: I   Polarity: -          Bus size: 10
                   This input determines the number of lines to be displayed in a
                   frame and is therefore used to control the vertical extents of
                   the visible display area.

      Data Memory Control Signals

      RD           Type: O   Polarity: High       Bus size: -
                   This is the enable signal when data is required to be read from
                   the memory space. This signal is controlled by, and follows,
                   the internal line enable signal en generated by the
                   Synchronization Unit of the Controller.

      DATA         Type: I   Polarity: -          Bus size: 8
                   Data input from picture memory space. Data is stored in memory
                   in bytes, the content of which depends on the chosen color
                   palette as follows:
                   Black & White mode: 8 1-bit pixels
                   16 Colors mode: 2 4-bit pixels
                   64 Colors mode: 1 6-bit pixel (DATA[5..0])
                   Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

      ADDR_PIXEL   Type: O   Polarity: -          Bus size: 19
                   Specifies the address of the next pixel in picture memory.
                   Addresses are consecutive: once the end of the current line has
                   been reached, the next address is that of the pixel at the
                   start of the next line down in the frame.

      VGA Monitor Control Signals

      HSYNC        Type: O   Polarity: Falling    Bus size: -
                   Horizontal synchronization signal. This signal is used to
                   control the horizontal deflection circuit in the VGA monitor,
                   so that the start and end of a line of pixels is correctly
                   displayed across the visible display area of the screen. The
                   horizontal size of the display area is controlled by the
                   DISPSIZE_H input to the Controller.

      VSYNC        Type: O   Polarity: Falling    Bus size: -
                   Vertical synchronization signal. This signal is used to control
                   the vertical deflection circuit in the VGA monitor, so that the
                   start and end of a frame (of lines) is correctly displayed
                   between the top and bottom edges of the visible display area of
                   the screen. The vertical size of the display area is controlled
                   by the DISPSIZE_V input to the Controller.

      R0, R1       Type: O   Polarity: High/Low   Bus size: -
                   Provides the 2-bit digital signal for the intensity of red used
                   in composing a pixel's displayed color. These two signals are
                   inputs to a simple 2-bit DAC (external to the Controller) that
                   provides the analog signal required by the VGA monitor.

      G0, G1       Type: O   Polarity: High/Low   Bus size: -
                   Provides the 2-bit digital signal for the intensity of green
                   used in composing a pixel's displayed color. These two signals
                   are inputs to a simple 2-bit DAC (external to the Controller)
                   that provides the analog signal required by the VGA monitor.

      B0, B1       Type: O   Polarity: High/Low   Bus size: -
                   Provides the 2-bit digital signal for the intensity of blue used
                   in composing a pixel's displayed color. These two signals are
                   inputs to a simple 2-bit DAC (external to the Controller) that
                   provides the analog signal required by the VGA monitor.

      9 VGA SIGNAL TIMING

      Figure 3 summarizes the signal timing involved in sending a line of

      pixels and a frame of lines. The actual time values differ according to the

      resolution selected (640x480 or 800x600), the processor-defined values for

      DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal

      (CLK).

      9.1 VGA PIXEL UNIT

      The Pixel Unit provides access to the pixelated image stored in external

      picture memory, reading in data a byte at a time and formatting each pixel to

      be displayed. For each pixel, the 6-bit RGB value required for the control of

      the monitor's three primary color electron guns is generated, so that the pixel

      is displayed on the screen with the correct color.

      Accessing the stored image

      The image to be displayed on the monitor screen is written by the host

      microcontroller into external memory space (RAM). This memory space can

      be located anywhere (e.g. using a block of RAM within the FPGA design, or

      using the SRAM on the NanoBoard itself).

      Pixel data is stored in the memory space in bytes. The number of pixels in a

      byte depends on which color palette is being used for the image (selected by

      the CMOD input):

      • in Black & White mode, one byte of data in RAM contains 8 1-bit pixels

      • in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels

      • in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel

      (DATA[7..6] are not used)

      The size of memory required to store a picture is determined by the total

      number of viewable pixels in a line (determined by DISPSIZE_H), the total

      number of viewable lines in a frame (determined by DISPSIZE_V), and the

      number of pixels stored in each byte in memory space:

      Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
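As a worked example of the memory formula, the Python sketch below evaluates it for the three color palettes. The mode keys and the function name are hypothetical labels chosen for this illustration; the pixels-per-byte figures are those given for each palette.

```python
# Pixels packed into one byte for each color palette.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory needed, rounded up to a whole byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


for mode in ("bw", "16", "64"):
    print(mode, picture_bytes(640, 480, mode))
# bw 38400
# 16 153600
# 64 307200
```

A full 640x480 picture therefore ranges from 38400 bytes in Black & White mode to 307200 bytes in 64 Colors mode.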

      The address in RAM where the next pixel is stored is determined using

      an internal signal provided by the Synchronization Unit, hvcnt, which

      reflects the current contents of the MEMADDR register. The exact addressing

      is described below.

      9.2 Black & White mode

      The picture memory address, the byte of data containing the next 8 pixels,

      is determined by using bits 18..3 of hvcnt, i.e. right-shifting the contents by

      three:

      ADDR_PIXEL = "00" & hvcnt[18..3]

      16 Colors mode

      The picture memory address, the byte of data containing the next 2 pixels,

      is determined by using bits 18..1 of hvcnt, i.e. right-shifting the contents by

      one:

      ADDR_PIXEL = "0" & hvcnt[18..1]

      64 Colors mode

      The picture memory address, the byte of data containing the next pixel, is

      determined by using the full value of hvcnt:

      ADDR_PIXEL = hvcnt[18..0]
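The three addressing rules amount to right-shifting the pixel counter by the base-2 logarithm of the pixels-per-byte count. A small Python model, assuming hvcnt is a 19-bit pixel index; the mode keys and the function name are hypothetical:

```python
# Right shift applied to hvcnt in each mode (8, 2, or 1 pixels per byte).
SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number `hvcnt`."""
    return (hvcnt & 0x7FFFF) >> SHIFT[mode]  # mask models the 19-bit counter


print(addr_pixel(17, "bw"))      # 2: pixels 16..23 share byte 2
print(addr_pixel(17, "16"))      # 8: pixels 16 and 17 share byte 8
print(addr_pixel(17, "64"))      # 17: one pixel per byte
print(addr_pixel(307199, "bw"))  # 38399: last byte of a 640x480 picture
```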

      The Pixel register (PIXREG)

      The Pixel register is used to receive the byte of pixel data read from the

      current address in memory space. The register PIXREG is reset to zero when

      the VGA Controller receives an external reset signal (RST).

      The register is updated on each rising edge of the CLK signal. Data can be

      read from the memory space as long as the RD signal is active (High). The RD

      signal is itself controlled by the external line display enable signal, enex. This

      internally generated signal is defined as follows:

      If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
         ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
         enex = 0
      Else
         enex = 1

      When enex is Low, read access from memory is disabled (RD = 0).

      The point at which data is loaded from memory into PIXREG depends on the

      particular color palette that is chosen: Black & White, 16 Colors, or 64

      Colors.

      Black & White mode

      The next byte of data will be loaded into the register whenever the

      lowest three bits of the hvcnt signal, received from the Synchronization Unit,

      are "000".

      For the currently loaded byte, the active pixel is always in the lowest bit

      position of the Pixel register. Each pixel in the data byte is moved into this

      active pixel position by shifting the contents of the register right by one bit on

      each rising edge of CLK.

      16 Colors mode

      The next byte of data will be loaded into the register whenever the

      lowest bit of the hvcnt signal, received from the Synchronization Unit, is a

      0.

      For the currently loaded byte, the active pixel is always in the low order

      nibble of the Pixel register. Remember that in this mode each byte of data

      contains two pixels. The second pixel is moved into this active pixel position

      by shifting the contents of the register right by four bits on the rising edge of

      CLK.

      64 Colors mode

      The next byte of data will be loaded into the register on the rising edge

      of the external system clock signal (CLK). In this mode, the read of pixel data

      does not depend on the status of the hvcnt signal received from the

      Synchronization Unit.
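The shift behaviour described for the three modes determines the order in which pixels leave a byte: lowest bit first in Black & White mode, low nibble first in 16 Colors mode. The Python sketch below models only that ordering, not the register timing; the mode keys and the function name are hypothetical.

```python
def unpack_pixels(byte: int, mode: str) -> list[int]:
    """Pixels held in one data byte, in the order they become active."""
    if mode == "bw":   # 8 pixels, shifted right one bit per clock
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16":   # 2 pixels, low nibble first, then shifted right by four
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64":   # 1 pixel in DATA[5..0]; bits 7 and 6 are unused
        return [byte & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


print(unpack_pixels(0xB4, "bw"))  # [0, 0, 1, 0, 1, 1, 0, 1]
print(unpack_pixels(0xB4, "16"))  # [4, 11]
print(unpack_pixels(0xB4, "64"))  # [52]
```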

      9.3 The RGB register (RGBREG)

      The RGB register is used to store the six bits that are required for

      driving the red, green, and blue color guns of the target monitor. When the

      chosen color palette is either Black & White or 16 Colors, these six bits are

      obtained by mapping the value of the active pixel to a predefined RGB code.

      When the chosen color palette is 64 Colors, the actual pixel value is used

      directly.

      The register RGBREG is reset to zero (000000) when the VGA

      Controller receives an external reset signal (RST). This RGB code represents

      black.

      The register is updated on each rising edge of the CLK signal, and the

      value loaded is dependent on the state of the line display enable signal, en.

      When en is Low, blanking is required and RGBREG is loaded with the code

      for black (000000).

      Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)

      and a 4-bit pixel (16 Colors mode) into the required RGB color code.

      The RGB color code stored in the RGB register is output from the VGA

      Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,

      and B1).

      The monitor itself expects analog signals as inputs to its electron gun control

      circuits. This is achieved by using 2-bit digital to analog converter circuitry

      located on the NanoBoard itself, as shown in Figure 4.

      Figure 9.1 Digital to Analog RGB conversion

      For each color, the 2-bit digital signal from the VGA Controller can be

      converted into 4 distinct analog levels. These levels specify the intensity of

      each of the three primary colors to use when displaying the pixel on the

      monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum

      brightness).

      With each analog input being one of four possible levels, the monitor

      can display each pixel on the screen with one of 64 different color

      permutations.
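The count of 64 colors is simply 4 levels raised to the power of 3 channels. The Python sketch below enumerates them and converts a 2-bit code to a voltage. The evenly spaced levels between 0 V and 0.7 V are an assumption made for illustration (the text only gives the two end points), and the function name is hypothetical.

```python
from itertools import product

FULL_SCALE_V = 0.7  # maximum brightness level given in the text


def dac_voltage(code: int) -> float:
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("a 2-bit code must be in the range 0..3")
    return FULL_SCALE_V * code / 3


levels = [round(dac_voltage(code), 3) for code in range(4)]
colors = list(product(range(4), repeat=3))  # every (R, G, B) code combination

print(levels)       # [0.0, 0.233, 0.467, 0.7]
print(len(colors))  # 64
```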

      Using Block RAM in Spartan-3E Generation FPGAs

      For applications requiring large on-chip memories, Spartan™-3

      Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.

      Using various configuration options, SelectRAM blocks create RAM, ROM,

      FIFOs, large look-up tables, data width converters, circular buffers, and shift

      registers, each supporting various data widths and depths. This application note

      describes the features and capabilities of block SelectRAM and illustrates how

      to specify the various options using the Xilinx CORE Generator™ system or

      via VHDL or Verilog instantiation.

      Table 2 Block RAM Available in Spartan-3E Devices

      Device      RAM       RAM Blocks   Total RAM   Total RAM   Total RAM
                  Columns   per Column   Blocks      Bits        Kbits
      XC3S100E    1         4            4           73,728      72
      XC3S250E    2         6            12          221,184     216
      XC3S500E    2         10           20          368,640     360
      XC3S1200E   2         14           28          516,096     504
      XC3S1600E   2         18           36          663,552     648

      Each block RAM contains 18,432 bits of fast static RAM, 16K bits of

      which is allocated to data storage and, in some memory configurations, an

      additional 2K bits allocated to parity or additional "plus" data bits. Physically,

      the block RAM memory has two completely independent access ports, labeled

      Port A and Port B. The structure is fully symmetrical, and both ports are

      interchangeable and support data read and write operations. Each

      memory port is synchronous, with its own clock, clock enable, and write

      enable. Read operations are also synchronous and require a clock edge and

      clock enable. Though physically a dual-port memory, block RAM can simulate

      a single-port memory in an application, as shown in Figure 1. Furthermore, each

      block memory supports multiple configurations or aspect ratios. Table 3

      summarizes the essential SelectRAM features. Multiple block RAMs can be

      cascaded to create deeper and wider memory organizations, with a minimal timing

      penalty incurred through specialized routing resources.
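To relate these figures to picture storage, the Python sketch below estimates how many block RAMs a frame buffer would occupy and which Spartan-3E device is the smallest that could hold it. It assumes only the 16K data bits of each block are used and that blocks are filled completely; the function names are hypothetical, and the block counts are the Total RAM Blocks values from the device table.

```python
DATA_BITS_PER_BLOCK = 16 * 1024  # data bits per block RAM, parity bits ignored

# Total RAM blocks per Spartan-3E device (from the device table).
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total: int, lin_total: int, bits_per_pixel: int) -> int:
    """Block RAMs required for a frame buffer, rounded up."""
    total_bits = pix_total * lin_total * bits_per_pixel
    return -(-total_bits // DATA_BITS_PER_BLOCK)  # ceiling division


def smallest_device(blocks: int):
    """First device (in ascending size order) with enough blocks, else None."""
    for name, available in TOTAL_BLOCKS.items():
        if available >= blocks:
            return name
    return None


need = blocks_needed(640, 480, 1)
print(need, smallest_device(need))                 # 19 XC3S500E
print(smallest_device(blocks_needed(640, 480, 4)))  # None: 75 blocks exceed every device
```

Under these assumptions, a 1-bit 640x480 picture fits in on-chip block RAM from the XC3S500E upwards, while deeper color palettes at full resolution would need external memory or a reduced display size.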

      10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

      Why (V)HDL?

      Interoperability

      Technology independence

      Design reuse

      Several levels of abstraction

      Readability

      Standard language

      Widely supported

      What is VHDL?

      VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed

      Integrated Circuit)

      Design specification language

      Design entry language

      Design simulation language

      Design documentation language

      An alternative to schematics

      10.1 Brief History

      VHDL was developed in the early 1980s for managing design problems that

      involved large circuits and multiple teams of engineers.

      Funded by the US Department of Defense.

      The first publicly available version was released in 1985.

      In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was

      presented with a proposal to standardize VHDL.

      In 1987, standardization => IEEE 1076-1987.

      An improved version of the language was released in 1994 => IEEE

      standard 1076-1993.

      10.2 Related Standards

      IEEE 1076 doesn't support simulation conditions such as unknown and

      high-impedance.

      Soon after IEEE 1076-1987 was released, simulator companies began using

      their own non-standard types => VHDL was becoming nonstandard.

      The IEEE 1164 standard was developed by the IEEE.

      IEEE 1164 contains definitions for a nine-valued data type, std_logic.

      IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate

      to actual hardware.

      It defines, e.g., two numeric types: signed and unsigned.

      VHDL Environment

      Design Units

      Segments of VHDL code that can be compiled separately and stored in a

      library.

      Entities

      A black box with an interface definition

      Defines the inputs/outputs of a component (defines pins)

      A way to represent modularity in VHDL

      Similar to a symbol in a schematic

      An entity declaration describes an entity

      E.g.

      entity Comparator is
        port (A, B : in  std_logic_vector(7 downto 0);
              EQ   : out std_logic);
      end Comparator;

      Ports

      Provide channels of communication between the component and its

      environment

      Each port must have a name, a direction, and a type

      An entity may have NO port declaration

      Port directions

      In: A value of a port can be read inside the component, but cannot be assigned.

      Multiple reads of a port are allowed.

      Out: Assignments can be made to a port, but data from a port cannot be read.

      Multiple assignments are allowed.

      In out: Bi-directional; assignments can be made and data can be read. Multiple

      assignments are allowed.

      Buffer: An out port with read capability. May have at most one assignment

      (buffer ports are not recommended).

      Architectures

      Every entity has at least one architecture

      One entity can have several architectures

      Architectures can describe a design using

      - Behavior
      - Structure
      - Dataflow

      Architectures can describe a design on many levels

      - Gate level
      - RTL (Register Transfer Level)
      - Behavioral level

      A configuration declaration links an architecture to an entity

      E.g.

      architecture Comparator1 of Comparator is
      begin
        EQ <= '1' when (A = B) else '0';
      end Comparator1;

      Configurations

      Link the entity declaration and the architecture body together

      The concept of a default configuration is a bit messy in VHDL '87

      - the last architecture analyzed links to the entity

      Can be used to change simulation behavior without re-analyzing the VHDL

      source

      Complex configuration declarations are ignored in synthesis

      Some entities can have, e.g., a gate-level architecture and a behavioral

      architecture

      Configurations are always optional

      Packages

      Packages contain information common to many design units

      1 Package declaration

      - constant declarations
      - type and subtype declarations
      - function and procedure declarations
      - global signal declarations
      - file declarations
      - component declarations

      2 Package body

      - is not necessarily needed
      - function bodies
      - procedure bodies

      Packages are meant for encapsulating data which can be shared globally among

      several design units

      A package consists of a declaration part and an optional body part

      Package declaration can contain

      - type and subtype declarations
      - subprograms
      - constants
      - alias declarations
      - global signal declarations
      - file declarations
      - component declarations

      Package body consists of

      - subprogram declarations and bodies
      - type and subtype declarations
      - deferred constants
      - file declarations

      Libraries

      Collection of VHDL design units (database)

      1 Packages

      package declaration

      package body

      2 Entities (entity declaration)

      3 Architectures (architecture body)

      4 Configurations (configuration declarations)

      Usually a directory in the UNIX file system

      Can also be any other kind of database

      Levels of Abstraction

      VHDL supports many possible styles of design description which differ

      primarily in how closely they relate to the HW

      It is possible to describe a circuit in a number of ways

      Structural-------

      Dataflow ------- Higher level of abstraction

      Behavioral -------

      Structural VHDL description

      Circuit is described in terms of its components

      From a low-level description (e.g. transistor-level description) to a high-level

      description (e.g. block diagram)

      For large circuits a low-level description quickly becomes impractical

      10.3 Dataflow VHDL Description

      Circuit is described in terms of how data moves through the system

      In the dataflow style you describe how information flows between registers in

      the system

      The combinational logic is described at a relatively high level; the placement

      and operation of registers is specified quite precisely

      The behavior of the system over time is defined by registers

      There are no built-in registers in the VHDL language

      - either a lower level description
      - or a behavioral description of sequential elements is needed

      The lower level register descriptions must be created or obtained

      If there are no 3rd party models for registers => you must write the behavioral

      description of the registers

      The behavioral description can be provided in the form of

      subprograms (functions or procedures)

      10.4 Behavioral VHDL Description

      Circuit is described in terms of its operation over time

      Representation might include, e.g., state diagrams, timing diagrams, and

      algorithmic descriptions

      The concept of time may be expressed precisely using delays (e.g. A <= B

      after 10 ns)

      If no actual delay is used, the order of sequential operations is defined

      In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed

      timing specifications

      The actual timing results depend on the implementation technology and the efficiency

      of the synthesis tool

      There are a few tools for behavioral synthesis

      Concurrent vs. Sequential

      Processes

      Basic simulation concept in VHDL

      A VHDL description can always be broken up into interconnected processes

      Quite similar to a Unix process

      Process keyword in VHDL

      The process statement is a concurrent statement

      Statements inside process statements are sequential statements

      A process must contain either a sensitivity list or wait statement(s), but NOT both

      The sensitivity list or wait statement(s) contain the signals which wake the process up

      General format

      process [(sensitivity_list)]
        process_declarative_part
      begin
        process_statements
        [wait_statement]
      end process;

      11 VGA CONTROLLER CODE

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      -- numeric_std is the only arithmetic package kept here. The original also
      -- listed STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED, which clash with it.
      use IEEE.numeric_std.ALL;

      ---- Uncomment the following library declaration if instantiating
      ---- any Xilinx primitives in this code.
      --library UNISIM;
      --use UNISIM.VComponents.all;

      entity vga_controller is
        port(
          clk50m : in  std_logic;   -- input clock (50 MHz, per the port name)
          rstn   : in  std_logic;   -- asynchronous reset, active low
          hsync  : out std_logic;
          vsync  : out std_logic;
          red    : out std_logic_vector(0 downto 0);
          green  : out std_logic_vector(0 downto 0);
          blue   : out std_logic_vector(0 downto 0));
      end vga_controller;

      architecture Behavioral of vga_controller is

        -- Image memory: one bit per pixel, 18-bit address.
        component memory
          port(
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic);
        end component;

        component sync_unit
          port(
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame)
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;    -- mode "out" matches the entity
            c_hs       : out integer);
        end component;

        component pixelgen
          port(
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic);
        end component;

        signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
        signal line_count_s, c_hs_s, addr : integer;
        signal data_s : std_logic_vector(0 downto 0);
        signal addr_s : std_logic_vector(17 downto 0);

      begin

        --addr_s <= std_logic_vector(to_unsigned(addr, 18));

        -- Divide the input clock by two to obtain the pixel clock.
        process(clk50m, rstn)
        begin
          if (rstn = '0') then
            clk <= '0';
          elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
          end if;
        end process;

        hsync <= hsync_s;
        vsync <= vsync_s;

        -- Sync generator: produces the sync pulses, enables and counters.
        sync : sync_unit port map(
          clk        => clk,
          rstn       => rstn,
          hsync      => hsync_s,
          vsync      => vsync_s,
          v_en_reg   => v_en_reg_s,
          line_count => line_count_s,
          h_en       => h_en_s,
          v_en       => v_en_s,
          c_hs       => c_hs_s
        );

        -- Pixel generator: drives RGB and the memory read address.
        pixel : pixelgen port map(
          clk        => clk,
          rstn       => rstn,
          hsync      => hsync_s,
          vsync      => vsync_s,
          h_en       => h_en_s,
          v_en       => v_en_s,
          v_en_reg   => v_en_reg_s,
          data_in    => data_s,
          line_count => line_count_s,
          c_hs       => c_hs_s,
          red        => red,
          green      => green,
          blue       => blue,
          addr       => addr_s,
          rd         => rd_s
        );

        -- NOTE: sinit is tied to the active-low reset as in the original. If the
        -- memory core treats sinit as active high, drive it with an inverted
        -- copy of rstn instead.
        mem12 : memory port map(
          addr  => addr_s,
          clk   => clk,
          dout  => data_s,
          en    => rd_s,
          sinit => rstn
        );

      end Behavioral;
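The clock-divider process in the listing above can be checked with a short behavioural model. This is only a sketch in Python, not part of the design; the function name divide_by_two is hypothetical, and the 50 MHz input frequency is an assumption taken from the port name clk50m.

```python
def divide_by_two(num_input_edges, rstn=1):
    """Model the toggle process: clk flips on every rising edge of clk50m."""
    clk = 0
    trace = []
    for _ in range(num_input_edges):
        if rstn == 0:
            clk = 0        # reset holds the divided clock low
        else:
            clk = 1 - clk  # clk <= not clk
        trace.append(clk)
    return trace


INPUT_HZ = 50_000_000  # assumed from the port name clk50m

trace = divide_by_two(8)
# Count 0 -> 1 transitions of the divided clock, starting from the reset value 0.
rising_edges = sum(
    1 for prev, cur in zip([0] + trace, trace) if prev == 0 and cur == 1
)
pixel_clock_hz = INPUT_HZ * rising_edges // len(trace)

print(trace)
print(pixel_clock_hz)
```

Eight input edges give four rising edges of clk, so the pixel clock runs at half the input frequency, 25 MHz under the stated assumption.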


      SYNCHRONIZATION BLOCK CODE

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.STD_LOGIC_ARITH.ALL;
      use IEEE.STD_LOGIC_UNSIGNED.ALL;

      entity sync_unit is
        port(
          clk        : in  std_logic;
          rstn       : in  std_logic;
          hsync      : out std_logic;  -- horizontal sync pulse (row)
          vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
          v_en_reg   : out std_logic;
          h_en       : out std_logic;
          v_en       : out std_logic;
          line_count : out integer;
          c_hs       : out integer);
      end sync_unit;

      architecture arch_sync_unit of sync_unit is

        signal h_en_s, v_en_s : std_logic;
        signal line_count_s, c_hs_s : integer;

      begin

        c_hs       <= c_hs_s;
        line_count <= line_count_s;
        h_en       <= h_en_s;
        v_en       <= v_en_s;

        -- hsync counter: counts pixel clocks 0..793 within a line
        process(clk, rstn)
        begin
          if (rstn = '0') then
            c_hs_s <= 0;
          else
            if (clk = '1' and clk'event) then
              if (c_hs_s = 793) then
                c_hs_s <= 0;
              else
                c_hs_s <= c_hs_s + 1;
              end if;
            end if;
          end if;
        end process;

        -- vsync counter
        -- vertical line counter: advances on each rising edge of h_en_s, 0..515
        process(h_en_s, rstn)
        begin
          if (rstn = '0') then
            line_count_s <= 0;
          elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
              line_count_s <= 0;
            end if;
          end if;
        end process;

        -- hsync pulse: low for counts 0..95
        process(clk, rstn)
        begin
          if (rstn = '0') then
            hsync <= '1';
          elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
              hsync <= '0';
            else
              hsync <= '1';
            end if;
          end if;
        end process;

        -- Horizontal enable (row information is sent in this period)
        process(clk, rstn)
        begin
          if (rstn = '0') then
            h_en_s <= '0';
          elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
              h_en_s <= '1';
            else
              h_en_s <= '0';
            end if;
          end if;
        end process;

        -- Vertical enable (data is valid in this region)
        process(clk, rstn)
        begin
          if (rstn = '0') then
            v_en_s <= '0';
          elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35) and (line_count_s <= 515)) then
              v_en_s <= '1';
            else
              v_en_s <= '0';
            end if;
          end if;
        end process;

        -- vsync pulse (vertical sync, front porch, back porch): low for lines 0..1
        process(clk, rstn)
        begin
          if (rstn = '0') then
            vsync <= '1';
          elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
              vsync <= '0';
            else
              vsync <= '1';
            end if;
          end if;
        end process;

        -- One-clock delayed copy of v_en_s, used to detect the start of a frame
        process(clk)
        begin
          if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
          end if;
        end process;

      end arch_sync_unit;
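The counter limits in the synchronization block fix the line and frame timing. The Python sketch below works these out; it is an illustration only. The function name vga_timing is hypothetical, and the 25 MHz pixel clock is an assumption (a 50 MHz input divided by two in the top-level controller).

```python
def vga_timing(pixel_clock_hz, h_total, v_total, hsync_clocks, vsync_lines):
    """Derive line and frame timing from the counter limits of a sync generator."""
    line_period_s = h_total / pixel_clock_hz
    return {
        "line_period_us": line_period_s * 1e6,
        "line_rate_khz": pixel_clock_hz / h_total / 1e3,
        "frame_rate_hz": pixel_clock_hz / (h_total * v_total),
        "hsync_width_us": hsync_clocks / pixel_clock_hz * 1e6,
        "vsync_width_us": vsync_lines * line_period_s * 1e6,
    }


timing = vga_timing(
    pixel_clock_hz=25_000_000,  # assumed: 50 MHz input divided by two
    h_total=794,                # c_hs_s counts 0..793
    v_total=516,                # line_count_s counts 0..515
    hsync_clocks=96,            # hsync is low while c_hs_s <= 95
    vsync_lines=2,              # vsync is low while line_count_s < 2
)

for name, value in timing.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a line lasts about 31.76 microseconds and the frame rate comes out at roughly 61 Hz, close to the 60 Hz refresh rate quoted elsewhere in this report.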


      PIXEL GENERATOR

      library IEEE;
      use IEEE.STD_LOGIC_1164.ALL;
      use IEEE.STD_LOGIC_ARITH.ALL;
      use IEEE.STD_LOGIC_UNSIGNED.ALL;

      entity pixelgen is
        port(
          clk        : in  std_logic;
          rstn       : in  std_logic;
          hsync      : in  std_logic;
          vsync      : in  std_logic;
          h_en       : in  std_logic;
          v_en       : in  std_logic;
          v_en_reg   : in  std_logic;
          data_in    : in  std_logic_vector(0 downto 0);
          line_count : in  integer;
          c_hs       : in  integer;
          red        : out std_logic_vector(0 downto 0);
          green      : out std_logic_vector(0 downto 0);
          blue       : out std_logic_vector(0 downto 0);
          addr       : out std_logic_vector(17 downto 0);
          rd         : out std_logic
        );
      end pixelgen;

      architecture behv of pixelgen is

        signal addr_s : std_logic_vector(17 downto 0);

      begin

        addr <= addr_s;

        process(clk, rstn)
        begin
          if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
          elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en (start of a frame): restart the read address.
            if ((v_en = '1') and (v_en_reg = '0')) then
              addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
              if (line_count <= 514) then
                if (c_hs <= 655) then
                  -- Image area: read the memory and show the stored bit on
                  -- all three colour outputs.
                  rd     <= '1';
                  red    <= data_in;
                  green  <= data_in;
                  blue   <= data_in;
                  addr_s <= addr_s + "000000000000000001";
                else
                  -- Right of the image: green.
                  rd    <= '0';
                  red   <= "0";
                  green <= "1";
                  blue  <= "0";
                end if;
              else
                -- Below the image: white.
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
              end if;
              -- Wrap the address after the last image location. This later
              -- assignment overrides the increment above.
              if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
              end if;
            else
              -- Blanking interval: black.
              red   <= "0";
              green <= "0";
              blue  <= "0";
            end if;
          end if;
        end process;

      end behv;
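The address wrap constant in the pixel generator can be related to the image size with a little arithmetic. The Python sketch below is an illustration under an assumption: the 512 x 480 image size is inferred from the comparisons on c_hs and line_count together with the enable windows of the synchronization block, and is not stated in the report.

```python
# Assumed image size, inferred from the listings rather than stated in the text:
# 512 pixels per line (counts 144..655 while h_en is high) and
# 480 lines (line_count 35..514 while v_en is high).
IMAGE_WIDTH = 512
IMAGE_HEIGHT = 480

# Wrap constant taken from the pixel generator listing.
WRAP_VALUE = int("111011111111111111", 2)

pixels = IMAGE_WIDTH * IMAGE_HEIGHT
addr_bits = (pixels - 1).bit_length()  # bits needed for addresses 0..pixels-1

print(WRAP_VALUE, pixels, addr_bits)
```

The wrap constant equals the last address of a 512 x 480 one-bit image, and 18 address bits are enough to reach it, which agrees with the 18-bit addr port.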


      12 FPGA ARCHITECTURE

      A field-programmable gate array (FPGA) is a semiconductor device that can
      be configured by the customer or designer after manufacturing, hence the
      name "field-programmable". FPGAs are programmed using a logic circuit
      diagram or source code in a hardware description language (HDL) to specify
      how the chip will work. They can be used to implement any logical function
      that an application-specific integrated circuit (ASIC) could perform, but
      the ability to update the functionality after shipping offers advantages
      for many applications.

      FPGAs contain programmable logic components called "logic blocks" and a
      hierarchy of reconfigurable interconnects that allow the blocks to be
      "wired together", somewhat like a one-chip programmable breadboard. Logic
      blocks can be configured to perform complex combinational functions or
      merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks
      also include memory elements, which may be simple flip-flops or more
      complete blocks of memory.

      History: The FPGA industry sprouted from programmable read-only memory
      (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the
      option of being programmed in batches in a factory or in the field (field
      programmable); however, programmable logic was hard-wired between logic
      gates.

      Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
      first commercially viable field-programmable gate array in 1985: the
      XC2064. The XC2064 had programmable gates and programmable interconnects
      between gates, the beginnings of a new technology and market. The XC2064
      boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
      tables (LUTs). More than 20 years later, Freeman was entered into the
      National Inventors Hall of Fame for his invention.

      Some of the industry's foundational concepts and technologies for
      programmable logic arrays, gates, and logic blocks are founded in patents
      awarded to David W. Page and LuVerne R. Peterson in 1985.

      In the late 1980s, the Naval Surface Warfare Department funded an
      experiment proposed by Steve Casselman to develop a computer that would
      implement 600,000 reprogrammable gates. Casselman was successful, and the
      system was awarded a patent in 1992.

      Xilinx continued unchallenged and quickly growing from 1985 to the
      mid-1990s, when competitors sprouted up, eroding significant market share.
      By 1993, Actel was serving about 18 percent of the market.

      The 1990s were an explosive period for FPGAs, both in sophistication and in
      the volume of production. In the early 1990s, FPGAs were primarily used in
      telecommunications and networking. By the end of the decade, FPGAs had
      found their way into consumer, automotive, and industrial applications.

      FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
      algorithm technology and FPGAs to create a sound recognition device.
      Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA
      chip to decide the configuration needed to accomplish a sound recognition
      task.

      12.1 Modern developments

      A recent trend has been to take the coarse-grained architectural approach
      a step further by combining the logic blocks and interconnects of
      traditional FPGAs with embedded microprocessors and related peripherals to
      form a complete "system on a programmable chip". This work mirrors the
      architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
      Group, which combined a reconfigurable CPU architecture on a single chip
      called the SB24. That work was done in 1982. Examples of such hybrid
      technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
      which include one or more PowerPC processors embedded within the FPGA's
      logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
      processor in combination with Atmel's programmable logic architecture.

      An alternative to using hard-macro processors is to make use of "soft"
      processor cores that are implemented within the FPGA logic.

      Many modern FPGAs can be reprogrammed at run time, and this is leading to
      the idea of reconfigurable computing or reconfigurable systems: CPUs that
      reconfigure themselves to suit the task at hand. The Mitrion Virtual
      Processor from Mitrionics is an example of a reconfigurable soft processor
      implemented on FPGAs. However, it does not support dynamic reconfiguration
      at run time, but instead adapts itself to a specific program.

      Additionally, new non-FPGA architectures are beginning to emerge.
      Software-configurable microprocessors such as the Stretch S5000 adopt a
      hybrid approach by providing an array of processor cores and FPGA-like
      programmable cores on the same chip.

      Gates

        1987: 9,000 gates, Xilinx
        1992: 600,000, Naval Surface Warfare Department
        Early 2000s: millions

      Market size

        1985: First commercial FPGA technology invented by Xilinx
        1987: $14 million
        ~1993: >$385 million
        2005: $1.9 billion
        2010 estimates: $2.75 billion

      12.2 FPGA Comparisons

      Historically, FPGAs have been slower, less energy efficient, and generally
      achieved less functionality than their fixed ASIC counterparts. A
      combination of volume, fabrication improvements, research and development,
      and the I/O capabilities of new supercomputers has largely closed the
      performance gap between ASICs and FPGAs.

      Advantages include a shorter time to market, the ability to re-program in
      the field to fix bugs, and lower non-recurring engineering costs. Vendors
      can also take a middle road by developing their hardware on ordinary FPGAs
      but manufacturing their final version so that it can no longer be modified
      after the design has been committed.

      Xilinx claims that several market and technology dynamics are changing the
      ASIC/FPGA paradigm:

        IC costs are rising aggressively.
        ASIC complexity has increased development time and costs.
        R&D resources and headcount are decreasing.
        Revenue losses for slow time-to-market are increasing.
        Financial constraints in a poor economy are driving low-cost technologies.

      These trends make FPGAs a better alternative to ASICs for a growing number
      of higher-volume applications than they have historically been used for,
      which the company credits for the growing number of FPGA design starts
      (see History).

      The primary differences between CPLDs and FPGAs are architectural. A CPLD
      has a somewhat restrictive structure consisting of one or more programmable
      sum-of-products logic arrays feeding a relatively small number of clocked
      registers. The result is less flexibility, with the advantage of more
      predictable timing delays and a higher logic-to-interconnect ratio. FPGA
      architectures, on the other hand, are dominated by interconnect. This makes
      them far more flexible (in terms of the range of designs that are practical
      for implementation within them) but also far more complex to design for.

      Another notable difference between CPLDs and FPGAs is the presence in most
      FPGAs of higher-level embedded functions (such as adders and multipliers)
      and embedded memories, as well as logic blocks that can implement decoders
      or mathematical functions.

      Some FPGAs have the capability of partial re-configuration, which lets one
      portion of the device be re-programmed while other portions continue
      running.

      12.3 Applications

      Applications of FPGAs include digital signal processing, software-defined
      radio, aerospace and defense systems, ASIC prototyping, medical imaging,
      computer vision, speech recognition, cryptography, bioinformatics, computer
      hardware emulation, radio astronomy, and a growing range of other areas.

      FPGAs originally began as competitors to CPLDs and competed in a similar
      space, that of glue logic for PCBs. As their size, capabilities, and speed
      increased, they began to take over larger and larger functions, to the
      point where some are now marketed as full systems on chips (SoC).
      Particularly with the introduction of dedicated multipliers into FPGA
      architectures in the late 1990s, applications which had traditionally been
      the sole reserve of DSPs began to incorporate FPGAs instead.

      FPGAs especially find applications in any area or algorithm that can make
      use of the massive parallelism offered by their architecture. One such area
      is code breaking, in particular brute-force attacks on cryptographic
      algorithms.

      FPGAs are increasingly used in conventional high-performance computing
      applications, where computational kernels such as FFT or convolution are
      performed on the FPGA instead of a microprocessor.

      The inherent parallelism of the logic resources on an FPGA allows for
      considerable computational throughput even at low clock rates. The
      flexibility of the FPGA allows for even higher performance by trading off
      precision and range in the number format for an increased number of
      parallel arithmetic units. This has driven a new type of processing called
      reconfigurable computing, where time-intensive tasks are offloaded from
      software to FPGAs.

      The adoption of FPGAs in high-performance computing is currently limited by
      the complexity of FPGA design compared to conventional software and by the
      extremely long turn-around times of current design tools, where a wait of
      4-8 hours is necessary after even minor changes to the source code.

      Traditionally, FPGAs have been reserved for specific vertical applications
      where the volume of production is small. For these low-volume applications,
      the premium that companies pay in hardware costs per unit for a
      programmable chip is more affordable than the development resources spent
      on creating an ASIC. Today, new cost and performance dynamics have
      broadened the range of viable applications.

      Architecture

      The most common FPGA architecture consists of an array of configurable
      logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
      routing channels have the same width (number of wires). Multiple I/O pads
      may fit into the height of one row or the width of one column in the array.

      An application circuit must be mapped into an FPGA with adequate resources.
      While the number of CLBs and I/Os required is easily determined from the
      design, the number of routing tracks needed may vary considerably even
      among designs with the same amount of logic. (For example, a crossbar
      switch requires much more routing than a systolic array with the same gate
      count.) Since unused routing tracks increase the cost (and decrease the
      performance) of the part without providing any benefit, FPGA manufacturers
      try to provide just enough tracks so that most designs that will fit in
      terms of LUTs and I/Os can be routed. This is determined by estimates such
      as those derived from Rent's rule or by experiments with existing designs.
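As a rough illustration of the kind of estimate mentioned above, Rent's rule relates the number of external terminals T of a block of logic to the number of gates g it contains as T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The Python sketch below simply evaluates that formula; the function name and the parameter values in the example are hypothetical and are not taken from any FPGA vendor.

```python
def rent_terminals(num_gates, terminals_per_gate, rent_exponent):
    """Evaluate Rent's rule, T = t * g**p, for a block of num_gates gates."""
    if num_gates < 1:
        raise ValueError("num_gates must be at least 1")
    return terminals_per_gate * num_gates ** rent_exponent


# Hypothetical values chosen only to show the trend: a higher Rent exponent
# (more "random" connectivity) needs more wiring for the same amount of logic.
for p in (0.5, 0.6, 0.7):
    print(p, round(rent_terminals(4096, 4, p)))
```

Designs with the same gate count but different Rent exponents need very different amounts of wiring, which is the point the crossbar versus systolic array comparison makes.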

      The FPGA considered here is an array, or island-style, FPGA. It consists
      of an array of logic blocks and routing channels. Two I/O pads fit into the
      height of one row or the width of one column, as shown below. All the
      routing channels have the same width (number of wires).

      Fig. 12.1 FPGA structure

      A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
      flip-flop, as shown below. In recent years, manufacturers have started
      moving to 6-input LUTs in their high-performance parts, claiming increased
      performance.

      Fig. Typical logic block

      There is only one output, which can be either the registered or the
      unregistered LUT output. The logic block has four inputs for the LUT and a
      clock input. Since clock signals (and often other high-fanout signals) are
      normally routed via special-purpose dedicated routing networks in
      commercial FPGAs, they and other signals are separately managed.
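The logic block described above can be modelled in a few lines. The Python sketch below is an illustration, not a description of any particular device: the class name LogicBlock and its methods are hypothetical. A 4-input LUT is a 16-entry truth table, and the block output is either the LUT output or the value captured by the flip-flop on the last clock edge.

```python
class LogicBlock:
    """A 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.ff = 0  # flip-flop state, cleared at start

    def lut(self, a, b, c, d):
        # The four inputs form the address into the truth table.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR: entry i is the parity of the bits of i.
xor4 = [bin(i).count("1") % 2 for i in range(16)]

comb = LogicBlock(xor4)
reg = LogicBlock(xor4, registered=True)
reg.clock(1, 0, 0, 0)
print(comb.output(1, 1, 0, 0), reg.output(0, 0, 0, 0))
```

Changing the 16 table entries changes the function the block implements, which is what configuring an FPGA logic block amounts to in this simplified model.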

      For this example architecture, the locations of the FPGA logic block pins
      are shown below.

      Fig. Logic block pin locations

      Each input is accessible from one side of the logic block, while the output
      pin can connect to routing wires in both the channel to the right and the
      channel below the logic block.

      Each logic block output pin can connect to any of the wiring segments in
      the channels adjacent to it.

      Similarly, an I/O pad can connect to any one of the wiring segments in the
      channel adjacent to it. For example, an I/O pad at the top of the chip can
      connect to any of the W wires (where W is the channel width) in the
      horizontal channel immediately below it.

      Generally, the FPGA routing is unsegmented. That is, each wiring segment
      spans only one logic block before it terminates in a switch box. By turning
      on some of the programmable switches within a switch box, longer paths can
      be constructed. For higher-speed interconnect, some FPGA architectures use
      longer routing lines that span multiple logic blocks.

      Whenever a vertical and a horizontal channel intersect, there is a switch
      box. In this architecture, when a wire enters a switch box, there are three
      programmable switches that allow it to connect to three other wires in
      adjacent channel segments. The pattern, or topology, of switches used in
      this architecture is the planar or domain-based switch box topology. In
      this switch box topology, a wire in track number one connects only to wires
      in track number one in adjacent channel segments, wires in track number 2
      connect only to other wires in track number 2, and so on. The figure below
      illustrates the connections in a switch box.
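The planar switch box rule described above can be written down directly. The Python sketch below is a hypothetical model (the side names and the function planar_connections are not from the source): a wire entering on one side in a given track can be switched only to the same track on each of the other three sides.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Return the (side, track) pairs reachable from a wire entering a
    planar (domain-based) switch box on the given side and track."""
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track is outside the channel")
    # Same track number on each of the three other sides.
    return [(other, track) for other in SIDES if other != side]


print(planar_connections("west", 1, channel_width=4))
```

Each wire therefore has exactly three programmable switches, and a signal stays in the same track number for as long as it is routed through planar switch boxes.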

      Switch box topology

      Modern FPGA families expand upon the above capabilities to include
      higher-level functionality fixed into the silicon. Having these common
      functions embedded into the silicon reduces the area required and gives
      those functions increased speed compared to building them from primitives.
      Examples of these include multipliers, generic DSP blocks, embedded
      processors, high-speed I/O logic, and embedded memories.

      FPGAs are also widely used for systems validation, including pre-silicon
      validation, post-silicon validation, and firmware development. This allows
      chip companies to validate their design before the chip is produced in the
      factory, reducing the time to market.

      12.4 FPGA Design and Programming

      To define the behavior of the FPGA, the user provides a hardware
      description language (HDL) or a schematic design. The HDL form might be
      easier to work with when handling large structures, because it is possible
      to specify them numerically rather than having to draw every piece by hand.
      On the other hand, schematic entry can allow for easier visualisation of a
      design.

      Then, using an electronic design automation tool, a technology-mapped
      netlist is generated. The netlist can then be fitted to the actual FPGA
      architecture using a process called place-and-route, usually performed by
      the FPGA company's proprietary place-and-route software. The user will
      validate the map, place, and route results via timing analysis, simulation,
      and other verification methodologies. Once the design and validation
      process is complete, the binary file generated (also using the FPGA
      company's proprietary software) is used to (re)configure the FPGA.

      Going from schematic/HDL source files to actual configuration: the source
      files are fed to a software suite from the FPGA/CPLD vendor that, through
      different steps, will produce a file. This file is then transferred to the
      FPGA/CPLD via a serial interface (JTAG) or to an external memory device
      like an EEPROM.

      The most common HDLs are VHDL and Verilog, although, in an attempt to
      reduce the complexity of designing in HDLs, which have been compared to the
      equivalent of assembly languages, there are moves to raise the abstraction
      level through the introduction of alternative languages.

      To simplify the design of complex systems in FPGAs, there exist libraries
      of predefined complex functions and circuits that have been tested and
      optimized to speed up the design process. These predefined circuits are
      commonly called IP cores and are available from FPGA vendors and
      third-party IP suppliers (rarely free, and typically released under
      proprietary licenses). Other predefined circuits are available from
      developer communities such as OpenCores (typically free, and released under
      the GPL, BSD, or a similar license) and other sources.

      In a typical design flow, an FPGA application developer will simulate the
      design at multiple stages throughout the design process. Initially, the RTL
      description in VHDL or Verilog is simulated by creating test benches to
      simulate the system and observe results. Then, after the synthesis engine
      has mapped the design to a netlist, the netlist is translated to a
      gate-level description, where simulation is repeated to confirm that the
      synthesis proceeded without errors. Finally, the design is laid out in the
      FPGA, at which point propagation delays can be added and the simulation run
      again with these values back-annotated onto the netlist.

      Basic Process Technology Types

      SRAM - based on static memory technology. In-system programmable and
      re-programmable. Requires external boot devices. CMOS.

      Antifuse - One-time programmable. CMOS.

      EPROM - Erasable Programmable Read-Only Memory technology. Usually
      one-time programmable in production because of plastic packaging. Windowed
      devices can be erased with ultraviolet (UV) light. CMOS.

      EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
      Can be erased, even in plastic packages. Some, but not all, EEPROM devices
      can be in-system programmed. CMOS.

      Flash - Flash-erase EPROM technology. Can be erased, even in plastic
      packages. Some, but not all, flash devices can be in-system programmed.
      Usually, a flash cell is smaller than an equivalent EEPROM cell and is
      therefore less expensive to manufacture. CMOS.

      Fuse - One-time programmable. Bipolar.

      Major Manufacturers

      Xilinx and Altera are the current FPGA market leaders and long-time
      industry rivals. Together, they control over 80 percent of the market, with
      Xilinx alone representing over 50 percent.

      Xilinx also provides free Windows and Linux design software, while Altera
      provides free Windows tools; the Solaris and Linux tools are only available
      via a rental scheme.

      Other competitors include Lattice Semiconductor (flash, SRAM), Actel
      (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
      power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic
      (handheld-focused CSSP, no general-purpose FPGAs).

      12.5 FPGA prototype

      FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
      prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
      for hardware verification and early software development.

      Verifying hardware designs alongside early software and firmware co-design
      has become mainstream, and prototyping SoC and ASIC designs on an FPGA has
      become a good method for doing this.

      Reasons why prototyping is important:

      1. Running an SoC design on an FPGA prototype is a reliable way to ensure
         that it is functionally correct, compared with designers relying only on
         software simulations to verify that their hardware design is sound.
         Simulation speed and modeling accuracy limitations hinder this
         development.

      2. Due to time constraints, many projects cannot wait until the silicon is
         back from the foundry to start on software tests. FPGA prototyping
         allows much more time for software development and testing at the
         software-hardware integration stage. This allows many unforeseen
         software bugs, which appear due to today's array of operating systems,
         applications, and hardware, to be found early.

      3. Prototyping also allows the developer to ensure that all IP technologies
         on the system work well together off the simulation stage and in actual
         form.

      4. Prototyping has the added advantage of serving as a demo platform for
         SoC clients, bringing in interest early. This speeds up the overall
         development cycle and allows for more enhancement or improvement of the
         chip features than would otherwise have been possible.

      13 SIMULATION RESULTS

      GATE LEVEL

      Fig. VGA CONTROLLER

      GATE LEVEL

      Fig. SYNC UNIT

      TECHNOLOGY SCHEMATIC

      14 APPLICATIONS

      Motion Tracking Systems

      Image processing systems

      Displaying Systems

      14.1 CONCLUSION

      Hence, implemented on an FPGA, the Controller takes a processor-generated
      picture (pixelated) from memory space and provides digital RGB values for
      each pixel, as well as horizontal and vertical synchronization signals, in
      order to correctly display the picture on a connected monitor.

      REFERENCES

      1. http://en.wikipedia.org/wiki/Video_Graphics_Array

      2. Eduardo Sanchez, "A VGA display controller"

      3. Hinner, "VGA Interface and video signal documents"


        3 DESCRIPTION ARCHITECTURE

        3.1 VGA display device

        The VGA display device is used for displaying the images taken from the
        processor in an exact format. It can be used for displaying the output
        of any project. Analog graphs, etc. can be displayed directly on the VGA
        display monitor.

        3.1 VGA (Video Graphics Array) Controller

        The VGA controller controls the image so that it is displayed exactly on
        the display device. It acts as an interface between the processing system
        and the display device, and is therefore called an interface controller.

        Image or video processing system -> VGA Controller -> Display

        4 CONTROLLER DESIGN

        To design the controller, we should first know how the display device
        works. Consider the image below.

        Fig. Original image (640x480)

        Fig. Pixels of image

        The display device also has pixels. A display resolution of 640x480 is a
        standard display resolution.

        Pixel

        Fig. An image with a portion greatly enlarged, in which the individual
        pixels are rendered as little squares and can easily be seen

        Fig. 4.2 A photograph of sub-pixel display elements on a laptop's LCD
        screen

        In digital imaging, a pixel (or picture element [1]) is a single point
        in a raster image. The pixel is the smallest addressable screen element;
        it is the smallest unit of picture which can be controlled. Each pixel
        has its own address. The address of a pixel corresponds to its
        coordinates. Pixels are normally arranged in a 2-dimensional grid and are
        often represented using dots or squares. Each pixel is a sample of an
        original image, where more samples typically provide more accurate
        representations of the original. The intensity of each pixel is variable.
        In color image systems, a color is typically represented by three or four
        component intensities, such as red, green, and blue, or cyan, magenta,
        yellow, and black.
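The correspondence between a pixel's address and its coordinates can be sketched as follows. This Python example assumes a row-major layout (pixels stored line by line, left to right) for a 640x480 frame; the layout and the function names are assumptions made for illustration and are not stated in the text.

```python
WIDTH, HEIGHT = 640, 480  # display resolution discussed in this section


def pixel_address(x, y, width=WIDTH, height=HEIGHT):
    """Map a coordinate (column x, row y) to a linear address, row by row."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("coordinate is outside the frame")
    return y * width + x


def pixel_coordinate(address, width=WIDTH, height=HEIGHT):
    """Inverse mapping: recover (x, y) from a linear address."""
    if not 0 <= address < width * height:
        raise ValueError("address is outside the frame")
    return address % width, address // width


print(pixel_address(639, 479))
print(pixel_coordinate(640))
```

With this layout, the last pixel of a 640x480 frame has address 307199, and address 640 is the first pixel of the second line.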

        Color refreshing rate

        Color refreshing rate = 60 Hz

        The color refreshing rate represents the number of frames that are
        transmitted per second.

        The minimum number of frames that the VGA controller can transmit per
        second is 60.

        5 Hardware Description

        5.1 Block Diagram

        5.1 VGA Synchronization Unit

        The Synchronization Unit provides the horizontal and vertical
        synchronization signals, HSYNC and VSYNC, that are required to correctly
        display a picture frame within the confines of a monitor's display area.

        These synchronization signals are used as control inputs by the monitor's
        horizontal and vertical deflection circuits. These circuits deflect the
        electrons emitted by the three primary color electron guns (Red, Green,
        Blue) left to right and from top to bottom, respectively. HSYNC provides
        the start and stop times for the horizontal deflection circuit, so that a
        line of pixels is correctly drawn across the screen display. VSYNC
        provides the start and stop times for the vertical deflection circuit, so
        that the lines of a frame are correctly drawn from the top to the bottom
        of the screen display.

        The resolution for the display is defined by the level on the

        RESOLUTION input If High the 640x480 resolution is used (VGA) If Low

        the 800x600 resolution (SVGA) is used

        Although the resolution determines the area of a monitorrsquos screen

        within which an image can be displayed the full extents of the chosen

        resolution do not have to be used The actual extents of the image display area

        on the screen can be controlled by the use of the DISPSIZE_H and

        DISPSIZE_V inputs These inputs determine the total number of pixels to be

        used in a line and the total number of lines in a frame respectively

        5.2 Blank pixel generation

        The total number of viewable pixels in a line and viewable lines in a
        frame is determined by the display resolution chosen through the
        RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
        the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
        chosen display resolution is used or not, the areas of the monitor screen to the
        top, bottom, left, and right of the viewable frame area are blanked by putting
        black pixels at the required line-pixel positions. This has the effect of centering
        the image on the screen.

        The color generated for a pixel in the Pixel Unit depends on whether
        the particular pixel needs to be blanked or not. The Synchronization Unit
        provides a signal to the Pixel Unit for this very reason. This is the line display
        enable signal - en. The signal is checked on each rising edge of the external
        clock signal (CLK) and is set as follows:

        If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then
            en = 0 (pixel requires to be blanked - set color to be black)
        Else
            en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
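
        The blanking rule above can be modelled in a few lines of Python. This is only
        a software sketch of the comparison, not the controller's hardware description;
        the function name and argument names are chosen here for illustration.

```python
def line_display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return 1 for a viewable pixel and 0 when the pixel must be blanked.

    Mirrors the rule in the text: en = 0 when HCOUNT >= PixTotal or
    VCOUNT >= LinTotal, otherwise en = 1.
    """
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # blanked: the Pixel Unit outputs black
    return 1      # viewable: the Pixel Unit generates the RGB color


# Example for a full 640x480 display area.
print(line_display_enable(639, 479, 640, 480))  # last viewable pixel
print(line_display_enable(640, 0, 640, 480))    # first blanked position in a line
```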

        6 SIGNALS IN VGA CONTROLLER

        For the most common VGA mode (640x480, 60 Hz, non-interlaced) the horizontal
        timings are:

        Parameter                   Value      Unit
        Clock frequency             25.175     MHz
        Horizontal pixels           640
        Horizontal sync polarity    Negative
        Total time for each line    31.77      µs
        Front porch (A)             0.94       µs
        Sync pulse length (B)       3.77       µs
        Back porch (C)              1.89       µs
        Active video (D)            25.17      µs

        (Total horizontal sync time: 6.60 µs)

        6.1 The vertical timings are:

        Parameter                   Value      Unit
        Vertical lines              480
        Vertical sync polarity      Negative
        Vertical frequency          59.94      Hz
        Front porch (E)             0.35       ms
        Sync pulse length (F)       0.06       ms
        Back porch (G)              1.02       ms
        Active video (H)            15.25      ms

        (Total vertical sync time: 1.43 ms)

        640x400 at 70 Hz is the video mode used for booting most x86 personal
        computers.

        640x480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

        It should be noted that the actual timings vary slightly. For example, for
        640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
        25.174 MHz gives 633 pixels rather than the expected 640 pixels.
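
        To see where the 633-pixel figure comes from, the listed durations can be
        converted into pixel-clock periods. The sketch below uses only the table values
        (25.175 MHz clock, durations in microseconds); the dictionary and function names
        are illustrative.

```python
PIXEL_CLOCK_MHZ = 25.175  # clock frequency from the horizontal timing table

# Horizontal segment durations in microseconds, as listed in the table.
horizontal_us = {
    "front_porch": 0.94,
    "sync_pulse": 3.77,
    "back_porch": 1.89,
    "active_video": 25.17,
}


def clocks(duration_us: float, clock_mhz: float = PIXEL_CLOCK_MHZ) -> float:
    """Number of pixel-clock periods in a duration (MHz x microseconds)."""
    return duration_us * clock_mhz


segment_clocks = {name: clocks(t) for name, t in horizontal_us.items()}
line_clocks = clocks(sum(horizontal_us.values()))

# Time actually needed to clock out 640 pixels at this frequency.
active_time_for_640_us = 640 / PIXEL_CLOCK_MHZ

for name, n in segment_clocks.items():
    print(f"{name:13s} {n:7.2f} clocks")
print(f"whole line    {line_clocks:7.2f} clocks")
print(f"640 pixels need {active_time_for_640_us:.2f} us of active video")
```

        The rounded table values give roughly 633 clocks of active video and about 800
        clocks per line, which is why the text warns that the quoted timings are only
        approximate.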

        6.2 Generation of the horizontal synchronization signal - HSYNC

        The HSYNC signal is High (inactive) after an external reset signal
        (RST) is received by the VGA Controller. The signal is updated on each rising
        edge of the external clock signal (CLK).

        The state of the HSYNC signal depends on the value stored in the horizontal
        counter and is driven low when

        HCOUNT ≥ (PixTotal + BlankingLeft)

        and remains low while

        HCOUNT < (PixTotal + BlankingLeft + TLSync)

        6.3 Generation of the vertical synchronization signal - VSYNC

        The VSYNC signal is High (inactive) after an external reset signal
        (RST) is received by the VGA Controller. The signal is updated after every
        line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

        The state of the VSYNC signal depends on the value stored in the vertical
        counter and is driven low when

        VCOUNT ≥ (LinTotal + BlankingLeft)

        and remains low while

        VCOUNT < (LinTotal + BlankingLeft + TFSync)
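
        Both sync signals follow the same window comparison, so a single helper can
        model either one. The sketch below is a software illustration only; the numeric
        values used in the example (16 clocks of blanking before a 96-clock pulse in an
        800-clock line) are assumed for demonstration and are not taken from the text.

```python
def sync_level(count: int, visible: int, blanking_before: int, sync_len: int) -> int:
    """Return 0 while the sync pulse is active (low), otherwise 1 (inactive, high).

    For HSYNC: count = HCOUNT, visible = PixTotal, sync_len = TLSync.
    For VSYNC: count = VCOUNT, visible = LinTotal, sync_len = TFSync.
    """
    start = visible + blanking_before
    return 0 if start <= count < start + sync_len else 1


# Assumed example values (not from the text): one 800-clock line.
hsync = [sync_level(h, visible=640, blanking_before=16, sync_len=96) for h in range(800)]
print("first low clock:", hsync.index(0))
print("low clocks per line:", hsync.count(0))
```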

        Address Counter (MEMADDR)

        The address counter is used to store the position of the next consecutive
        pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
        signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
        the next pixel from picture memory.

        The counter is reset to zero when the VGA Controller receives an external
        reset signal (RST). The size of the counter depends on the values chosen for
        DISPSIZE_H and DISPSIZE_V, as the range is simply

        0 to (PixTotal x LinTotal) - 1

        Taking the maximum number of pixels in a line and lines in a frame for each
        of the supported display resolutions, the maximum ranges for the counter are:

        • 640x480: 0 to 307199
        • 800x600: 0 to 479999

        The counter has 19-bit resolution.
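
        The 19-bit figure can be checked directly from the two address ranges. A minimal
        sketch, with an illustrative helper name:

```python
def counter_bits(pix_total: int, lin_total: int) -> int:
    """Bits needed to hold every address in 0 .. (pix_total * lin_total) - 1."""
    max_address = pix_total * lin_total - 1
    return max(1, max_address.bit_length())


for width, height in ((640, 480), (800, 600)):
    print(f"{width}x{height}: max address {width * height - 1}, "
          f"{counter_bits(width, height)} bits")
```

        Both resolutions need 19 bits, since 2^18 = 262144 is smaller than either
        range and 2^19 = 524288 covers both.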

        While the value in the horizontal counter (HCOUNT) is less than the
        total number of viewable pixels in a line (PixTotal, the integer value of
        DISPSIZE_H), the counter is incremented on the rising edge of the external
        clock signal (CLK). Pixel addressing within the frame is consecutive. When
        the counter reaches the last pixel in a line, its incremented value is the first
        pixel in the next line down.

        The address counter will continue to be incremented until the value in the
        vertical counter (VCOUNT) is greater than or equal to the total number of
        viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
        point, it will be rolled over to zero.

        7 VGA Controller

        The VGA Controller provides a simple 8-bit interface between a host
        microcontroller and any VGA-compatible monitor. This variant of the
        Controller provides six modes of display, depending on the resolution chosen
        (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
        White, 16 Colors, or 64 Colors).

        7.1 VGA Controller Tasks

        1) To display the frames at a minimum 60 Hz refresh rate:

        1 sec = 60 frames
        1 frame = max 1/60 sec
        1 pixel = (1/60 sec) / 307200 ≈ 54 ns

        Allowing for the blanking intervals in each line and frame, the maximum
        time for 1 pixel = 40 ns.

        2) To send information for each and every pixel of a frame by means of
        RGB signals.

        3) After completely displaying one row of information, the column must be
        reset and the row has to be incremented; this is done by the Hsync signal.

        4) After completing one frame of information, a new frame has to be started;
        for this, the row and column are reset, which is done by the Vsync signal.

        To give the information for 1 pixel we use 1 clock (for controlling all
        operations):

        1 clock = 1 pixel
        1 clock cycle period = 40 ns
        Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

        So if we use a 25 MHz clock and display the information for 1 pixel on each and
        every clock, we can attain a minimum 60 Hz refresh rate. Do not use a clock
        frequency of less than 25 MHz.

        Ex:

        15 MHz: time period = more; no. of frames/sec = less (decreases). It is
        unable to display 60 frames/sec.
        25 MHz: effectively utilized frequency.
        30 MHz: time period = less; no. of frames/sec = more (increases).
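
        The clock-frequency argument can be checked numerically. The sketch below
        assumes the usual 640x480 frame of 800 clocks per line and 525 lines per frame
        once blanking is included; that total is an assumption introduced here, not a
        figure from the text, and the function names are illustrative.

```python
REFRESH_HZ = 60
VISIBLE_PIXELS = 640 * 480   # 307200 viewable pixels, as in the text
TOTAL_CLOCKS = 800 * 525     # assumed clocks per frame including blanking


def max_pixel_time_ns(pixels_per_frame: int, refresh_hz: int = REFRESH_HZ) -> float:
    """Longest time one pixel may take if a frame must finish in 1/refresh_hz s."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def frames_per_second(clock_hz: float, clocks_per_frame: int = TOTAL_CLOCKS) -> float:
    """Frame rate obtained when one pixel position is output per clock."""
    return clock_hz / clocks_per_frame


print(f"visible pixels only: {max_pixel_time_ns(VISIBLE_PIXELS):.2f} ns per pixel")
print(f"including blanking:  {max_pixel_time_ns(TOTAL_CLOCKS):.2f} ns per pixel")
for mhz in (15, 25, 30):
    print(f"{mhz} MHz clock -> {frames_per_second(mhz * 1e6):.1f} frames/sec")
```

        Under these assumptions a 15 MHz clock falls well short of 60 frames per second,
        25 MHz lands close to it, and 30 MHz exceeds it, in line with the comparison above.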

        Signals used for image display:

        R (Red)
        G (Green)
        B (Blue)
        HSYNC (Horizontal Synchronization signal)
        VSYNC (Vertical Synchronization signal)

        All of the above signals are used to control the image data so that it is
        correctly displayed on the VGA monitor.

        7.2 Timings for front porch, sync pulse, and back porch

                        Horizontal sync    Vertical sync
        Front porch     0.94 µs            0.35 ms
        Sync pulse      3.77 µs            0.063 ms
        Back porch      1.89 µs            1.02 ms

        8 FUNCTIONAL DESCRIPTION

        8.1 Symbol

        VGA Controller Symbol

        Pin description

        The pin description is shown in the table below.

        Table: VGA Pin description

        Name | Type | Polarity / Bus size | Description

        Global Control Signals

        CLK | I | Rising | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
            640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
            800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz
        RST | I | High | Global system reset.

        VGA Input Settings Signals

        RESOLUTION | I | High / Low | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).
        CMOD | I | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.
        DISPSIZE_H | I | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.
        DISPSIZE_V | I | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

        Data Memory Control Signals

        RD | O | High | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.
        DATA | I | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows:
            Black & White mode: 8 1-bit pixels
            16 Colors mode: 2 4-bit pixels
            64 Colors mode: 1 6-bit pixel (DATA[5..0])
            Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.
        ADDR_PIXEL | O | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive - once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

        VGA Monitor Control Signals

        HSYNC | O | Falling | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.
        VSYNC | O | Falling | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.
        R1 | O | High / Low | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
        G1 | O | High / Low | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
        B1 | O | High / Low | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

        9 VGA SIGNAL TIMING

        Figure 3 summarizes the signal timing involved in sending a line of
        pixels and a frame of lines. The actual time values differ according to the
        resolution selected (640x480 or 800x600), the processor-defined values for
        DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
        (CLK).

        9.1 VGA PIXEL UNIT

        The Pixel Unit provides access to the pixelated image stored in external
        picture memory, reading in data a byte at a time and formatting each pixel to
        be displayed. For each pixel, the 6-bit RGB value required for the control of
        the monitor's three primary color electron guns is generated, so that the pixel
        is displayed on the screen with the correct color.

        Accessing the stored image

        The image to be displayed on the monitor screen is written by the host
        microcontroller into external memory space (RAM). This memory space can
        be located anywhere (e.g. using a block of RAM within the FPGA design, or
        using the SRAM on the NanoBoard itself).

        Pixel data is stored in the memory space in bytes. The number of pixels in a
        byte depends on which color palette is being used for the image (selected by
        the CMOD input):

        • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels
        • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels
        • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
          (DATA[7..6] are not used)
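
        The three packing schemes can be illustrated by splitting one byte into its
        pixels. In this sketch the mode keys ("bw", "16", "64") and the function name
        are illustrative, and pixels are taken lowest-order first, on the assumption
        that the first pixel displayed is the one in the lowest bit position of the
        byte.

```python
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}
BITS_PER_PIXEL = {"bw": 1, "16": 4, "64": 6}


def unpack_pixels(byte: int, mode: str) -> list:
    """Split one memory byte into pixel values, lowest-order pixel first."""
    width = BITS_PER_PIXEL[mode]
    mask = (1 << width) - 1
    return [(byte >> (i * width)) & mask for i in range(PIXELS_PER_BYTE[mode])]


print(unpack_pixels(0b10110001, "bw"))  # eight 1-bit pixels
print(unpack_pixels(0xA5, "16"))        # two 4-bit pixels
print(unpack_pixels(0xFF, "64"))        # one 6-bit pixel; bits 7 and 6 are ignored
```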

        The size of memory required to store a picture is determined by the total
        number of viewable pixels in a line (determined by DISPSIZE_H), the total
        number of viewable lines in a frame (determined by DISPSIZE_V), and the
        number of pixels stored in each byte in memory space:

        Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
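
        As a worked example of this formula, the sketch below evaluates it for the full
        display area in each color mode. The function name is illustrative, and the
        result is rounded up to a whole byte in case the pixel count does not divide
        evenly.

```python
def picture_memory_bytes(pix_total: int, lin_total: int, pixels_per_byte: int) -> int:
    """Bytes of picture memory: (PixTotal x LinTotal) / pixels per byte, rounded up."""
    pixels = pix_total * lin_total
    return -(-pixels // pixels_per_byte)  # ceiling division


for mode, per_byte in (("Black & White", 8), ("16 Colors", 2), ("64 Colors", 1)):
    print(f"640x480, {mode}: {picture_memory_bytes(640, 480, per_byte)} bytes")
```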

        The address in RAM where the next pixel is stored is determined using
        an internal signal provided by the Synchronization Unit - hvcnt - which
        reflects the current contents of the MEMADDR register. The exact addressing
        is described below.

        9.2 Black & White mode

        The picture memory address - the byte of data containing the next 8 pixels
        - is determined by using bits 18..3 of hvcnt and right shifting the contents by
        three:

        ADDR_PIXEL = "00" & hvcnt[18..3]

        16 Colors mode

        The picture memory address - the byte of data containing the next 2 pixels
        - is determined by using bits 18..1 of hvcnt and right shifting the contents by
        one:

        ADDR_PIXEL = "0" & hvcnt[18..1]

        64 Colors mode

        The picture memory address - the byte of data containing the next pixel - is
        determined by using the full value of hvcnt:

        ADDR_PIXEL = hvcnt[18..0]
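
        In software terms, each of these address calculations is a right shift of the
        pixel counter by the number of bits that select a pixel within a byte. A minimal
        sketch, with illustrative mode keys and function name:

```python
# Number of low-order hvcnt bits that select a pixel inside one byte.
SHIFT_BY_MODE = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory for pixel number hvcnt."""
    return hvcnt >> SHIFT_BY_MODE[mode]


for mode in ("bw", "16", "64"):
    print(mode, [addr_pixel(n, mode) for n in range(10)])
```

        In Black & White mode, eight consecutive pixel numbers map to the same byte
        address; in 16 Colors mode, two do; in 64 Colors mode, each pixel has its own
        address.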

        The Pixel register (PIXREG)

        The Pixel register is used to receive the byte of pixel data read from the
        current address in memory space. The register PIXREG is reset to zero when
        the VGA Controller receives an external reset signal (RST).

        The register is updated on each rising edge of the CLK signal. Data can be
        read from the memory space as long as the RD signal is active (High). The RD
        signal is itself controlled by the external line display enable signal, enex. This
        internally generated signal is defined as follows:

        If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT >
        (LinTotal+1)) and (VCOUNT < Frame Period)) then
            enex = 0
        Else
            enex = 1

        When enex is Low, read access from memory is disabled (RD = 0).

        The point at which data is loaded from memory into PIXREG depends on the
        particular color palette that is chosen - Black & White, 16 Colors, or 64
        Colors.

        Black & White mode

        The next byte of data will be loaded into the register whenever the
        lowest three bits of the hvcnt signal - received from the Synchronization Unit -
        are "000".

        For the currently loaded byte, the active pixel is always in the lowest bit
        position of the Pixel register. Each pixel in the data byte is moved into this
        active pixel position by shifting the contents of the register right by one bit on
        each rising edge of CLK.

        16 Colors mode

        The next byte of data will be loaded into the register whenever the
        lowest bit of the hvcnt signal - received from the Synchronization Unit - is a
        0.

        For the currently loaded byte, the active pixel is always in the low-order
        nibble of the Pixel register. Remember that in this mode, each byte of data
        contains two pixels. The second pixel is moved into this active pixel position
        by shifting the contents of the register right by four bits on the rising edge of
        CLK.

        64 Colors mode

        The next byte of data will be loaded into the register on the rising edge
        of the external system clock signal (CLK). In this mode, the read of pixel data
        does not depend on the status of the hvcnt signal received from the
        Synchronization Unit.
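
        The load-or-shift behavior of PIXREG in the three modes can be sketched as a
        small Python generator. This is a behavioral model only: it ignores the RD and
        enex gating and memory latency, and the mode keys and function name are
        illustrative.

```python
def pixreg_stream(memory, pixel_count, mode):
    """Yield the active pixel for hvcnt = 0 .. pixel_count - 1.

    A new byte is loaded when the low-order bits of hvcnt are zero;
    otherwise the register is shifted right so the next pixel moves
    into the active (lowest) position.
    """
    width, per_byte = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}[mode]
    mask = (1 << width) - 1
    pixreg = 0  # reset value of PIXREG
    for hvcnt in range(pixel_count):
        if hvcnt % per_byte == 0:
            pixreg = memory[hvcnt // per_byte]  # load the next byte
        else:
            pixreg >>= width                    # shift the next pixel into place
        yield pixreg & mask


print(list(pixreg_stream([0b00000101], 8, "bw")))
print(list(pixreg_stream([0xA5, 0x3C], 4, "16")))
print(list(pixreg_stream([0xFF, 0x01], 2, "64")))
```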

        9.3 The RGB register (RGBREG)

        The RGB register is used to store the six bits that are required for
        driving the red, green, and blue color guns of the target monitor. When the
        chosen color palette is either Black & White or 16 Colors, these six bits are
        obtained by mapping the value of the active pixel to a predefined RGB code.
        When the chosen color palette is 64 Colors, the actual pixel value is used
        directly.

        The register RGBREG is reset to zero (000000) when the VGA
        Controller receives an external reset signal (RST). This RGB code represents
        black.

        The register is updated on each rising edge of the CLK signal, and the
        value loaded is dependent on the state of the line display enable signal, en.
        When en is Low, blanking is required and RGBREG is loaded with the code
        for black (000000).

        Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
        and a 4-bit pixel (16 Colors mode) into the required RGB color code.

        The RGB color code stored in the RGB register is output from the VGA
        Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
        and B1).

        The monitor itself expects analog signals as inputs to its electron gun control
        circuits. This is achieved by using 2-bit digital to analog converter circuitry,
        located on the NanoBoard itself, as shown in Figure 4.

        Figure 9.1 Digital to Analog RGB conversion

        For each color, the 2-bit digital signal from the VGA Controller can be
        converted into 4 distinct analog levels. These levels specify the intensity of
        each of the three primary colors to use when displaying the pixel on the
        monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
        brightness).

        With each analog input being one of four possible levels, the monitor
        can display each pixel on the screen with one of 64 different color
        permutations.
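
        The count of 64 colors follows from three channels with four levels each. The
        sketch below also lists the four voltages, on the assumption that the levels are
        evenly spaced between 0 V and 0.7 V; the text gives only the end points, so the
        two intermediate values are illustrative.

```python
V_MAX = 0.7  # volts at maximum brightness, from the text


def dac_level(code: int) -> float:
    """Analog level for a 2-bit code, assuming evenly spaced steps (an assumption)."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * code / 3


levels = [round(dac_level(code), 3) for code in range(4)]
color_count = len(levels) ** 3  # independent choices for R, G and B

print("levels (V):", levels)
print("colors:", color_count)
```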

        Using Block RAM in Spartan-3E Generation FPGAs

        For applications requiring large on-chip memories, Spartan™-3
        Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
        Using various configuration options, SelectRAM blocks create RAM, ROM,
        FIFOs, large look-up tables, data width converters, circular buffers, and shift
        registers, each supporting various data widths and depths. This application note
        describes the features and capabilities of block SelectRAM and illustrates how
        to specify the various options using the Xilinx CORE Generator™ system or
        via VHDL or Verilog instantiation.

        Table 2 Block RAM Available in Spartan-3E Devices

        Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
        XC3S100E    1             4                       4                  73728            72
        XC3S250E    2             6                       12                 221184           216
        XC3S500E    2             10                      20                 368640           360
        XC3S1200E   2             14                      28                 516096           504
        XC3S1600E   2             18                      36                 663552           648

        Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
        which are allocated to data storage and, in some memory configurations, an
        additional 2K bits allocated to parity or additional "plus" data bits. Physically,
        the block RAM memory has two completely independent access ports, labeled
        Port A and Port B. The structure is fully symmetrical, and both ports are
        interchangeable and support data read and write operations. Each
        memory port is synchronous, with its own clock, clock enable, and write
        enable. Read operations are also synchronous and require a clock edge and
        clock enable. Though physically a dual-port memory, block RAM simulates
        single-port memory in an application, as shown in Figure 1. Furthermore, each
        block memory supports multiple configurations or aspect ratios. Table 3
        summarizes the essential SelectRAM features. Cascade multiple block RAMs
        to create deeper and wider memory organizations, with a minimal timing
        penalty incurred through specialized routing resources.
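
        These figures make it possible to estimate whether a frame fits in on-chip block
        RAM. The sketch below takes the block counts from Table 2 and the per-block sizes
        from the paragraph above, and counts only the 16K data bits of each block. It is
        a rough capacity check under those assumptions and ignores how the memory would
        actually be organized; the function names are illustrative.

```python
BLOCK_BITS = 18432               # total bits per block RAM
DATA_BITS_PER_BLOCK = 16 * 1024  # bits per block allocated to data storage

# Total RAM blocks per device, from Table 2.
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def total_ram_bits(device: str) -> int:
    """Total block RAM bits in a device (parity bits included)."""
    return TOTAL_BLOCKS[device] * BLOCK_BITS


def frame_fits(device: str, pix_total: int, lin_total: int, bits_per_pixel: int) -> bool:
    """True if one frame fits in the data bits of the device's block RAM."""
    needed = pix_total * lin_total * bits_per_pixel
    return needed <= TOTAL_BLOCKS[device] * DATA_BITS_PER_BLOCK


for device in TOTAL_BLOCKS:
    print(device, total_ram_bits(device), frame_fits(device, 640, 480, 1))
```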

        10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

        Why (V)HDL?

        Interoperability

        Technology independence

        Design reuse

        Several levels of abstraction

        Readability

        Standard language

        Widely supported

        What is VHDL?

        VHDL = VHSIC Hardware Description Language (VHSIC = Very High Speed
        Integrated Circuit)

        Design specification language

        Design entry language

        Design simulation language

        Design documentation language

        An alternative to schematics

        10.1 Brief History

        VHDL was developed in the early 1980s for managing design problems that
        involved large circuits and multiple teams of engineers.
        Funded by the US Department of Defense.
        The first publicly available version was released in 1985.
        In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
        presented with a proposal to standardize VHDL.
        In 1987, standardization => IEEE 1076-1987.
        An improved version of the language was released in 1994 => IEEE
        Standard 1076-1993.

        10.2 Related Standards

        IEEE 1076 doesn't support simulation conditions such as unknown and
        high-impedance.
        Soon after IEEE 1076-1987 was released, simulator companies began using
        their own non-standard types => VHDL was becoming non-standard.
        The IEEE 1164 standard was developed by the IEEE.
        IEEE 1164 contains definitions for a nine-valued data type, std_logic.
        IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
        to actual hardware.
        Defines, e.g., two numeric types: signed and unsigned.

        VHDL Environment

        Design Units

        Segments of VHDL code that can be compiled separately and stored in a
        library.

        Entities

        A black box with an interface definition.
        Defines the inputs/outputs of a component (defines pins).
        A way to represent modularity in VHDL.
        Similar to a symbol in a schematic.
        The entity declaration describes the entity.

        E.g.

        entity Comparator is
            port (A, B : in  std_logic_vector(7 downto 0);
                  EQ   : out std_logic);
        end Comparator;

        Ports

        Provide channels of communication between the component and its
        environment.
        Each port must have a name, a direction, and a type.
        An entity may have NO port declaration.

        Port directions

        In: The value of a port can be read inside the component, but cannot be assigned.
        Multiple reads of the port are allowed.
        Out: Assignments can be made to a port, but data from a port cannot be read.
        Multiple assignments are allowed.
        Inout: Bi-directional; assignments can be made and data can be read. Multiple
        assignments are allowed.
        Buffer: An out port with read capability. May have at most one assignment
        (not recommended).

        Architectures

        Every entity has at least one architecture.
        One entity can have several architectures.
        Architectures can describe a design using:
        - Behavior
        - Structure
        - Dataflow
        Architectures can describe a design on many levels:
        - Gate level
        - RTL (Register Transfer Level)
        - Behavioral level
        A configuration declaration links an architecture to an entity.

        E.g.

        architecture Comparator1 of Comparator is
        begin
            EQ <= '1' when (A = B) else '0';
        end Comparator1;

        Configurations

        Link the entity declaration and the architecture body together.
        The concept of a default configuration is a bit messy in VHDL '87:
        - The last architecture analyzed links to the entity.
        Can be used to change simulation behavior without re-analyzing the VHDL
        source.
        Complex configuration declarations are ignored in synthesis.
        Some entities can have, e.g., a gate-level architecture and a behavioral
        architecture.
        Are always optional.

        Packages

        Packages contain information common to many design units.

        1 Package declaration
        - constant declarations
        - type and subtype declarations
        - function and procedure declarations
        - global signal declarations
        - file declarations
        - component declarations

        2 Package body
        - is not necessarily needed
        - function bodies
        - procedure bodies

        Packages are meant for encapsulating data which can be shared globally among
        several design units.
        A package consists of a declaration part and an optional body part.

        The package declaration can contain:
        - type and subtype declarations
        - subprograms
        - constants
        - alias declarations
        - global signal declarations
        - file declarations
        - component declarations

        The package body consists of:
        - subprogram declarations and bodies
        - type and subtype declarations
        - deferred constants
        - file declarations

        Libraries

        A collection of VHDL design units (a database):

        1 Packages
            package declaration
            package body
        2 Entities (entity declaration)
        3 Architectures (architecture body)
        4 Configurations (configuration declarations)

        Usually a directory in the UNIX file system.
        Can also be any other kind of database.

        Levels of Abstraction

        VHDL supports many possible styles of design description, which differ
        primarily in how closely they relate to the hardware.
        It is possible to describe a circuit in a number of ways:

        Structural
        Dataflow
        Behavioral
        (listed in order of increasing level of abstraction)

        Structural VHDL description

        The circuit is described in terms of its components.
        From a low-level description (e.g. transistor-level description) to a high-level
        description (e.g. block diagram).
        For large circuits, a low-level description quickly becomes impractical.

        10.3 Dataflow VHDL Description

        The circuit is described in terms of how data moves through the system.
        In the dataflow style, you describe how information flows between registers in
        the system.
        The combinational logic is described at a relatively high level; the placement
        and operation of registers is specified quite precisely.
        The behavior of the system over time is defined by registers.
        There are no built-in registers in the VHDL language:
        - either a lower-level description
        - or a behavioral description of sequential elements is needed.
        The lower-level register descriptions must be created or obtained.
        If there are no third-party models for registers, you must write the behavioral
        description of the registers.
        The behavioral description can be provided in the form of
        subprograms (functions or procedures).

        10.4 Behavioral VHDL Description

        The circuit is described in terms of its operation over time.
        The representation might include, e.g., state diagrams, timing diagrams, and
        algorithmic descriptions.
        The concept of time may be expressed precisely using delays (e.g. A <= B
        after 10 ns).
        If no actual delay is used, the order of sequential operations is defined.
        In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
        timing specifications.
        The actual timing results depend on the implementation technology and the
        efficiency of the synthesis tool.
        There are a few tools for behavioral synthesis.

        Concurrent vs. Sequential

        Processes

        The basic simulation concept in VHDL.
        A VHDL description can always be broken up into interconnected processes.
        Quite similar to a Unix process.
        The process keyword in VHDL.
        A process statement is a concurrent statement.
        Statements inside process statements are sequential statements.
        A process must contain either a sensitivity list or wait statement(s), but NOT both.
        The sensitivity list or wait statement(s) contain the signals which wake the process up.

        General format:

        process [(sensitivity_list)]
            process_declarative_part
        begin
            process_statements
            [wait_statement]
        end process;

        11 VGA CONTROLLER CODE

        library IEEE;
        use IEEE.STD_LOGIC_1164.ALL;
        use IEEE.STD_LOGIC_ARITH.ALL;
        use IEEE.STD_LOGIC_UNSIGNED.ALL;
        use IEEE.numeric_std.ALL;

        ---- Uncomment the following library declaration if instantiating
        ---- any Xilinx primitives in this code.
        --library UNISIM;
        --use UNISIM.VComponents.all;

        -- Top level: divides the 50 MHz input clock by two and connects the
        -- sync unit, the pixel generator and the image memory.
        entity vga_controller is
          port (
            clk50m : in  std_logic;   -- 50 MHz input clock
            rstn   : in  std_logic;   -- asynchronous reset, active low
            hsync  : out std_logic;
            vsync  : out std_logic;
            red    : out std_logic_vector(0 downto 0);
            green  : out std_logic_vector(0 downto 0);
            blue   : out std_logic_vector(0 downto 0)
          );
        end vga_controller;

        architecture Behavioral of vga_controller is

          -- Image memory: 18-bit address, one bit per pixel
          component memory
            port (
              addr  : in  std_logic_vector(17 downto 0);
              clk   : in  std_logic;
              dout  : out std_logic_vector(0 downto 0);
              en    : in  std_logic;
              sinit : in  std_logic
            );
          end component;

          component sync_unit
            port (
              clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : out std_logic;  -- horizontal sync pulse (row)
              vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
              v_en_reg   : out std_logic;
              h_en       : out std_logic;
              v_en       : out std_logic;
              line_count : out integer;    -- mode matches the sync_unit entity
              c_hs       : out integer
            );
          end component;

          component pixelgen
            port (
              clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : in  std_logic;
              vsync      : in  std_logic;
              h_en       : in  std_logic;
              v_en       : in  std_logic;
              v_en_reg   : in  std_logic;
              data_in    : in  std_logic_vector(0 downto 0);
              line_count : in  integer;
              c_hs       : in  integer;
              red        : out std_logic_vector(0 downto 0);
              green      : out std_logic_vector(0 downto 0);
              blue       : out std_logic_vector(0 downto 0);
              addr       : out std_logic_vector(17 downto 0);
              rd         : out std_logic
            );
          end component;

          signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
          signal line_count_s, c_hs_s, addr : integer;
          signal data_s : std_logic_vector(0 downto 0);
          signal addr_s : std_logic_vector(17 downto 0);

        begin

          --addr_s <= std_logic_vector(to_unsigned(addr,18));

          -- Divide the 50 MHz clock by two to obtain the 25 MHz pixel clock
          process(clk50m, rstn)
          begin
            if (rstn = '0') then
              clk <= '0';
            elsif (clk50m = '1' and clk50m'event) then
              clk <= not clk;
            end if;
          end process;

          hsync <= hsync_s;
          vsync <= vsync_s;

          sync : sync_unit port map(
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync_s,
            vsync      => vsync_s,
            v_en_reg   => v_en_reg_s,
            line_count => line_count_s,
            h_en       => h_en_s,
            v_en       => v_en_s,
            c_hs       => c_hs_s
          );

          pixel : pixelgen port map(
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync_s,
            vsync      => vsync_s,
            h_en       => h_en_s,
            v_en       => v_en_s,
            v_en_reg   => v_en_reg_s,
            data_in    => data_s,
            line_count => line_count_s,
            c_hs       => c_hs_s,
            red        => red,
            green      => green,
            blue       => blue,
            addr       => addr_s,
            rd         => rd_s
          );

          mem12 : memory port map(
            addr  => addr_s,
            clk   => clk,
            dout  => data_s,
            en    => rd_s,
            sinit => rstn
          );

        end Behavioral;


        SYNCHRONIZATION BLOCK CODE

        library IEEE;
        use IEEE.STD_LOGIC_1164.ALL;
        use IEEE.STD_LOGIC_ARITH.ALL;
        use IEEE.STD_LOGIC_UNSIGNED.ALL;

        entity sync_unit is
          port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
          );
        end sync_unit;

        architecture arch_sync_unit of sync_unit is
          signal h_en_s, v_en_s       : std_logic;
          signal line_count_s, c_hs_s : integer;
        begin

          c_hs       <= c_hs_s;
          line_count <= line_count_s;
          h_en       <= h_en_s;
          v_en       <= v_en_s;

          -- Horizontal counter: counts pixel clocks 0 to 793 within each line
          process(clk, rstn)
          begin
            if (rstn = '0') then
              c_hs_s <= 0;
            else
              if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                  c_hs_s <= 0;
                else
                  c_hs_s <= c_hs_s + 1;
                end if;
              end if;
            end if;
          end process;

          -- Vertical line counter: advances once per line, on the rising edge
          -- of h_en_s, and counts lines 0 to 515
          process(h_en_s, rstn)
          begin
            if (rstn = '0') then
              line_count_s <= 0;
            elsif (h_en_s = '1' and h_en_s'event) then
              line_count_s <= line_count_s + 1;
              if (line_count_s = 515) then
                line_count_s <= 0;
              end if;
            end if;
          end process;

          -- hsync pulse: low for the first 96 clocks of each line
          process(clk, rstn)
          begin
            if (rstn = '0') then
              hsync <= '1';
            elsif (clk = '1' and clk'event) then
              if (c_hs_s <= 95) then
                hsync <= '0';
              else
                hsync <= '1';
              end if;
            end if;
          end process;

          -- Horizontal enable (row information is sent in this period)
          process(clk, rstn)
          begin
            if (rstn = '0') then
              h_en_s <= '0';
            elsif (clk = '1' and clk'event) then
              if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
              else
                h_en_s <= '0';
              end if;
            end if;
          end process;

          -- Vertical enable (data is valid in this region)
          process(clk, rstn)
          begin
            if (rstn = '0') then
              v_en_s <= '0';
            elsif (clk = '1' and clk'event) then
              if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
              else
                v_en_s <= '0';
              end if;
            end if;
          end process;

          -- vsync pulse (vertical sync, front porch, back porch):
          -- low during lines 0 and 1
          process(clk, rstn)
          begin
            if (rstn = '0') then
              vsync <= '1';
            elsif (clk = '1' and clk'event) then
              if (line_count_s < 2) then
                vsync <= '0';
              else
                vsync <= '1';
              end if;
            end if;
          end process;

          -- v_en delayed by one clock, used by pixelgen to detect the start of a frame
          process(clk)
          begin
            if (clk = '1' and clk'event) then
              v_en_reg <= v_en_s;
            end if;
          end process;

        end arch_sync_unit;
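        As a worked check of the timing, and assuming the counters behave exactly
        as written in the code above (the horizontal counter runs from 0 to 793 on
        the divided 25 MHz clock, and the line counter runs from 0 to 515,
        advancing once per line), the line and frame rates can be estimated as
        follows. The symbol names are introduced here for illustration, and the
        block assumes the amsmath package.

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
N_{\text{clocks per line}} &= 793 + 1 = 794 \\
N_{\text{lines per frame}} &= 515 + 1 = 516 \\
f_{\text{line}} &= \frac{25 \times 10^{6}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{25 \times 10^{6}}{794 \times 516}
                  = \frac{25 \times 10^{6}}{409\,704} \approx 61.0\ \text{Hz}
\end{align*}
```

        This is close to the nominal 60 Hz refresh rate. For comparison, the
        standard 640x480 timing uses 800 clocks per line and 525 lines per frame
        at a 25.175 MHz pixel clock, which gives about 59.94 Hz.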


        PIXEL GENERATOR

        library IEEE;
        use IEEE.STD_LOGIC_1164.ALL;
        use IEEE.STD_LOGIC_ARITH.ALL;
        use IEEE.STD_LOGIC_UNSIGNED.ALL;

        entity pixelgen is
          port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
          );
        end pixelgen;

        architecture behv of pixelgen is
          signal addr_s : std_logic_vector(17 downto 0);
        begin

          addr <= addr_s;

          process(clk, rstn)
          begin
            if (rstn = '0') then
              rd     <= '0';
              red    <= "0";
              green  <= "0";
              blue   <= "0";
              addr_s <= x"0000" & "00";
            elsif (clk = '1' and clk'event) then
              -- Rising edge of v_en: restart at the first pixel of the frame
              if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
              end if;
              if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                  if (c_hs <= 655) then
                    -- Image region: read the next pixel from memory
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                  else
                    -- To the right of the image: green
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                  end if;
                else
                  -- Below the image: white
                  rd    <= '0';
                  red   <= "1";
                  green <= "1";
                  blue  <= "1";
                end if;
                -- Wrap after the last pixel: 245759 = 512 * 480 - 1
                if (addr_s = "111011111111111111") then
                  addr_s <= (others => '0');
                end if;
              else
                -- Blanking: drive black
                red   <= "0";
                green <= "0";
                blue  <= "0";
              end if;
            end if;
          end process;

        end behv;


        12 FPGA ARCHITECTURE

        A field-programmable gate array (FPGA) is a semiconductor device that can
        be configured by the customer or designer after manufacturing, hence the
        name "field-programmable". FPGAs are programmed using a logic circuit
        diagram or source code in a hardware description language (HDL) to specify
        how the chip will work. They can be used to implement any logical function
        that an application-specific integrated circuit (ASIC) could perform, but the
        ability to update the functionality after shipping offers advantages for many
        applications.

        FPGAs contain programmable logic components called "logic blocks" and a
        hierarchy of reconfigurable interconnects that allow the blocks to be "wired
        together", somewhat like a one-chip programmable breadboard. Logic blocks
        can be configured to perform complex combinational functions or merely
        simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
        include memory elements, which may be simple flip-flops or more complete
        blocks of memory.

        History

        The FPGA industry sprouted from programmable read-only memory (PROM) and
        programmable logic devices (PLDs). PROMs and PLDs both had the option of
        being programmed in batches in a factory or in the field (field
        programmable); however, programmable logic was hard-wired between logic
        gates.

        Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
        first commercially viable field-programmable gate array in 1985: the
        XC2064. The XC2064 had programmable gates and programmable interconnects
        between gates, the beginnings of a new technology and market. The XC2064
        boasted a mere 64 configurable logic blocks (CLBs), with two 3-input
        lookup tables (LUTs). More than 20 years later, Freeman was entered into
        the National Inventors Hall of Fame for his invention.

        Some of the industry's foundational concepts and technologies for
        programmable logic arrays, gates, and logic blocks are founded in patents
        awarded to David W. Page and LuVerne R. Peterson in 1985.

        In the late 1980s, the Naval Surface Warfare Department funded an
        experiment proposed by Steve Casselman to develop a computer that would
        implement 600,000 reprogrammable gates. Casselman was successful, and the
        system was awarded a patent in 1992.

        Xilinx continued unchallenged and quickly growing from 1985 to the
        mid-1990s, when competitors sprouted up, eroding significant market share.
        By 1993, Actel was serving about 18 percent of the market.

        The 1990s were an explosive period of time for FPGAs, both in
        sophistication and in the volume of production. In the early 1990s, FPGAs
        were primarily used in telecommunications and networking. By the end of
        the decade, FPGAs had found their way into consumer, automotive, and
        industrial applications.

        FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
        algorithm technology and FPGAs to create a sound recognition device.
        Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA
        chip to decide the configuration needed to accomplish a sound recognition
        task.

        12.1 Modern developments

        A recent trend has been to take the coarse-grained architectural approach
        a step further by combining the logic blocks and interconnects of
        traditional FPGAs with embedded microprocessors and related peripherals to
        form a complete "system on a programmable chip". This work mirrors the
        architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
        Group, which combined a reconfigurable CPU architecture on a single chip
        called the SB24. That work was done in 1982. Examples of such hybrid
        technologies can be found in the Xilinx Virtex-II PRO and Virtex-4
        devices, which include one or more PowerPC processors embedded within the
        FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses
        an AVR processor in combination with Atmel's programmable logic
        architecture.

        An alternate approach to using hard-macro processors is to make use of
        "soft" processor cores that are implemented within the FPGA logic.

        As previously mentioned, many modern FPGAs have the ability to be
        reprogrammed at run time, and this is leading to the idea of
        reconfigurable computing or reconfigurable systems: CPUs that reconfigure
        themselves to suit the task at hand. The Mitrion Virtual Processor from
        Mitrionics is an example of a reconfigurable soft processor implemented on
        FPGAs. However, it does not support dynamic reconfiguration at runtime,
        but instead adapts itself to a specific program.

        Additionally, new non-FPGA architectures are beginning to emerge.
        Software-configurable microprocessors such as the Stretch S5000 adopt a
        hybrid approach by providing an array of processor cores and FPGA-like
        programmable cores on the same chip.

        Gates

        1987: 9,000 gates, Xilinx
        1992: 600,000, Naval Surface Warfare Department
        Early 2000s: millions

        Market size

        1985: first commercial FPGA technology invented by Xilinx
        1987: $14 million
        ~1993: >$385 million
        2005: $1.9 billion
        2010 estimates: $2.75 billion


        12.2 FPGA Comparisons

        Historically, FPGAs have been slower, less energy efficient and generally
        achieved less functionality than their fixed ASIC counterparts. A
        combination of volume, fabrication improvements, research and development,
        and the I/O capabilities of new supercomputers has largely closed the
        performance gap between ASICs and FPGAs.

        Advantages include a shorter time to market, the ability to re-program in
        the field to fix bugs, and lower non-recurring engineering costs. Vendors
        can also take a middle road by developing their hardware on ordinary FPGAs
        but manufacturing their final version so that it can no longer be modified
        after the design has been committed.

        Xilinx claims that several market and technology dynamics are changing the
        ASIC/FPGA paradigm:

        IC costs are rising aggressively.
        ASIC complexity has increased development time and costs.
        R&D resources and headcount are decreasing.
        Revenue losses for slow time-to-market are increasing.
        Financial constraints in a poor economy are driving low-cost technologies.

        These trends make FPGAs a better alternative than ASICs for a growing
        number of higher-volume applications than they have historically been used
        for, to which the company attributes the growing number of FPGA design
        starts (see History).

        The primary differences between CPLDs and FPGAs are architectural. A CPLD
        has a somewhat restrictive structure consisting of one or more
        programmable sum-of-products logic arrays feeding a relatively small
        number of clocked registers. The result of this is less flexibility, with
        the advantage of more predictable timing delays and a higher
        logic-to-interconnect ratio. The FPGA architectures, on the other hand,
        are dominated by interconnect. This makes them far more flexible (in terms
        of the range of designs that are practical for implementation within them)
        but also far more complex to design for.

        Another notable difference between CPLDs and FPGAs is the presence in most
        FPGAs of higher-level embedded functions (such as adders and multipliers)
        and embedded memories, as well as logic blocks that can implement decoders
        or mathematical functions.

        Some FPGAs have the capability of partial re-configuration, which lets one
        portion of the device be re-programmed while other portions continue
        running.

        12.3 Applications

        Applications of FPGAs include digital signal processing, software-defined
        radio, aerospace and defense systems, ASIC prototyping, medical imaging,
        computer vision, speech recognition, cryptography, bioinformatics,
        computer hardware emulation, radio astronomy and a growing range of other
        areas.

        FPGAs originally began as competitors to CPLDs and competed in a similar
        space, that of glue logic for PCBs. As their size, capabilities, and speed
        increased, they began to take over larger and larger functions, to the
        point where some are now marketed as full systems on chips (SoC).
        Particularly with the introduction of dedicated multipliers into FPGA
        architectures in the late 1990s, applications which had traditionally been
        the sole reserve of DSPs began to incorporate FPGAs instead.

        FPGAs especially find applications in any area or algorithm that can make
        use of the massive parallelism offered by their architecture. One such
        area is code breaking, in particular brute-force attacks on cryptographic
        algorithms.

        FPGAs are increasingly used in conventional high performance computing
        applications, where computational kernels such as FFT or convolution are
        performed on the FPGA instead of a microprocessor.

        The inherent parallelism of the logic resources on an FPGA allows for
        considerable computational throughput even at low MHz clock rates. The
        flexibility of the FPGA allows for even higher performance by trading off
        precision and range in the number format for an increased number of
        parallel arithmetic units. This has driven a new type of processing called
        reconfigurable computing, where time-intensive tasks are offloaded from
        software to FPGAs.

        The adoption of FPGAs in high performance computing is currently limited
        by the complexity of FPGA design compared to conventional software and by
        the extremely long turn-around times of current design tools, where a wait
        of 4-8 hours is necessary after even minor changes to the source code.

        Traditionally, FPGAs have been reserved for specific vertical applications
        where the volume of production is small. For these low-volume
        applications, the premium that companies pay in hardware costs per unit
        for a programmable chip is more affordable than the development resources
        spent on creating an ASIC. Today, new cost and performance dynamics have
        broadened the range of viable applications.

        Architecture

        The most common FPGA architecture consists of an array of configurable
        logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
        routing channels have the same width (number of wires). Multiple I/O pads
        may fit into the height of one row or the width of one column in the
        array.

        An application circuit must be mapped into an FPGA with adequate
        resources. While the number of CLBs and I/Os required is easily determined
        from the design, the number of routing tracks needed may vary considerably
        even among designs with the same amount of logic. (For example, a crossbar
        switch requires much more routing than a systolic array with the same gate
        count.) Since unused routing tracks increase the cost (and decrease the
        performance) of the part without providing any benefit, FPGA manufacturers
        try to provide just enough tracks so that most designs that will fit in
        terms of LUTs and I/Os can be routed. This is determined by estimates such
        as those derived from Rent's rule or by experiments with existing designs.

        The FPGA is an array or island-style FPGA. It consists of an array of
        logic blocks and routing channels. Two I/O pads fit into the height of one
        row or the width of one column, as shown below. All the routing channels
        have the same width (number of wires).

        12.1 FPGA structure

        A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
        flip-flop, as shown below. In recent years, manufacturers have started
        moving to 6-input LUTs in their high performance parts, claiming increased
        performance.

        Typical logic block

        There is only one output, which can be either the registered or the
        unregistered LUT output. The logic block has four inputs for the LUT and a
        clock input. Since clock signals (and often other high-fanout signals) are
        normally routed via special-purpose dedicated routing networks in
        commercial FPGAs, they and other signals are separately managed.
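        The logic block described above can be sketched as a small behavioral
        model: a 16-entry truth table indexed by the four inputs, a flip-flop,
        and a selectable registered or unregistered output. The entity
        logic_block, the generics INIT and REGISTERED, and the port names are all
        assumptions made for this sketch; real devices fix these choices through
        configuration bits rather than generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: one 4-input LUT plus one flip-flop.
entity logic_block is
  generic (
    -- Truth table of the LUT; bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- true: output the flip-flop; false: output the LUT directly
    REGISTERED : boolean := true
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-to-1 lookup addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

        Changing INIT changes the logic function without changing the structure,
        which is the sense in which the block is "programmable".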

        For this example architecture, the locations of the FPGA logic block pins
        are shown below.

        Logic Block Pin Locations

        Each input is accessible from one side of the logic block, while the
        output pin can connect to routing wires in both the channel to the right
        and the channel below the logic block.

        Each logic block output pin can connect to any of the wiring segments in
        the channels adjacent to it.

        Similarly, an I/O pad can connect to any one of the wiring segments in the
        channel adjacent to it. For example, an I/O pad at the top of the chip can
        connect to any of the W wires (where W is the channel width) in the
        horizontal channel immediately below it.

        Generally, the FPGA routing is unsegmented. That is, each wiring segment
        spans only one logic block before it terminates in a switch box. By
        turning on some of the programmable switches within a switch box, longer
        paths can be constructed. For higher speed interconnect, some FPGA
        architectures use longer routing lines that span multiple logic blocks.

        Whenever a vertical and a horizontal channel intersect, there is a switch
        box. In this architecture, when a wire enters a switch box, there are
        three programmable switches that allow it to connect to three other wires
        in adjacent channel segments. The pattern, or topology, of switches used
        in this architecture is the planar or domain-based switch box topology. In
        this switch box topology, a wire in track number one connects only to
        wires in track number one in adjacent channel segments, wires in track
        number 2 connect only to other wires in track number 2, and so on. The
        figure below illustrates the connections in a switch box.

        Switch box topology

        Modern FPGA families expand upon the above capabilities to include higher
        level functionality fixed into the silicon. Having these common functions
        embedded into the silicon reduces the area required and gives those
        functions increased speed compared to building them from primitives.
        Examples of these include multipliers, generic DSP blocks, embedded
        processors, high speed I/O logic and embedded memories.


        FPGAs are also widely used for systems validation, including pre-silicon
        validation, post-silicon validation, and firmware development. This allows
        chip companies to validate their design before the chip is produced in the
        factory, reducing the time to market.

        12.4 FPGA Design and Programming

        To define the behavior of the FPGA, the user provides a hardware
        description language (HDL) or a schematic design. The HDL form might be
        easier to work with when handling large structures, because it is possible
        to just specify them numerically rather than having to draw every piece by
        hand. On the other hand, schematic entry can allow for easier
        visualisation of a design.

        Then, using an electronic design automation tool, a technology-mapped
        netlist is generated. The netlist can then be fitted to the actual FPGA
        architecture using a process called place-and-route, usually performed by
        the FPGA company's proprietary place-and-route software. The user will
        validate the map, place and route results via timing analysis, simulation,
        and other verification methodologies. Once the design and validation
        process is complete, the binary file generated (also using the FPGA
        company's proprietary software) is used to (re)configure the FPGA.

        Going from schematic/HDL source files to the actual configuration: the
        source files are fed to a software suite from the FPGA/CPLD vendor that,
        through different steps, produces a file. This file is then transferred to
        the FPGA/CPLD via a serial interface (JTAG) or to an external memory
        device such as an EEPROM.

        The most common HDLs are VHDL and Verilog, although, in an attempt to
        reduce the complexity of designing in HDLs, which have been compared to
        the equivalent of assembly languages, there are moves to raise the
        abstraction level through the introduction of alternative languages.


        To simplify the design of complex systems in FPGAs, there exist libraries
        of predefined complex functions and circuits that have been tested and
        optimized to speed up the design process. These predefined circuits are
        commonly called IP cores, and are available from FPGA vendors and
        third-party IP suppliers (rarely free, and typically released under
        proprietary licenses). Other predefined circuits are available from
        developer communities such as OpenCores (typically free, and released
        under the GPL, BSD or similar license), and other sources.

        In a typical design flow, an FPGA application developer will simulate the
        design at multiple stages throughout the design process. Initially, the
        RTL description in VHDL or Verilog is simulated by creating test benches
        to simulate the system and observe results. Then, after the synthesis
        engine has mapped the design to a netlist, the netlist is translated to a
        gate level description, where simulation is repeated to confirm that the
        synthesis proceeded without errors. Finally, the design is laid out in the
        FPGA, at which point propagation delays can be added and the simulation
        run again with these values back-annotated onto the netlist.
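        The first simulation stage, an RTL description exercised by a test bench,
        might look like the sketch below. It uses a divide-by-two clock divider
        similar in spirit to the one in the controller's top level, but the
        entities clk_div2 and clk_div2_tb, their ports, and the chosen check
        times are all assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: toggles its output on every input clock edge.
entity clk_div2 is
  port (
    clk_in  : in  std_logic;
    rstn    : in  std_logic;   -- asynchronous reset, active low
    clk_out : out std_logic
  );
end entity clk_div2;

architecture rtl of clk_div2 is
  signal q : std_logic := '0';
begin
  process (clk_in, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk_in) then
      q <= not q;
    end if;
  end process;
  clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports, generates stimulus and checks the response.
entity clk_div2_tb is
end entity clk_div2_tb;

architecture sim of clk_div2_tb is
  signal clk_in  : std_logic := '0';
  signal rstn    : std_logic := '0';
  signal clk_out : std_logic;
  signal done    : boolean := false;
begin
  dut : entity work.clk_div2
    port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

  -- 50 MHz clock (20 ns period); stops once the checks are finished.
  clk_gen : process
  begin
    while not done loop
      clk_in <= '0'; wait for 10 ns;
      clk_in <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process clk_gen;

  stimulus : process
  begin
    wait for 25 ns;
    rstn <= '1';                        -- release reset between clock edges
    wait until rising_edge(clk_in);     -- first edge after reset
    wait for 1 ns;
    assert clk_out = '1' report "output did not go high" severity error;
    wait until rising_edge(clk_in);     -- second edge after reset
    wait for 1 ns;
    assert clk_out = '0' report "output did not go low" severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

        The same test bench can in principle be reused against the gate-level and
        back-annotated netlists, which is what makes the repeated simulation
        stages practical.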

        Basic Process Technology Types

        SRAM - Based on static memory technology. In-system programmable and
        re-programmable. Requires external boot devices. CMOS.

        Antifuse - One-time programmable. CMOS.

        EPROM - Erasable Programmable Read-Only Memory technology. Usually
        one-time programmable in production because of plastic packaging. Windowed
        devices can be erased with ultraviolet (UV) light. CMOS.

        EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
        Can be erased, even in plastic packages. Some, but not all, EEPROM devices
        can be in-system programmed. CMOS.

        Flash - Flash-erase EPROM technology. Can be erased, even in plastic
        packages. Some, but not all, flash devices can be in-system programmed.
        Usually, a flash cell is smaller than an equivalent EEPROM cell and is
        therefore less expensive to manufacture. CMOS.

        Fuse - One-time programmable. Bipolar.

        Major Manufacturers

        Xilinx and Altera are the current FPGA market leaders and long-time
        industry rivals. Together, they control over 80 percent of the market,
        with Xilinx alone representing over 50 percent.

        Xilinx also provides free Windows and Linux design software, while Altera
        provides free Windows tools; the Solaris and Linux tools are only
        available via a rental scheme.

        Other competitors include Lattice Semiconductor (flash, SRAM), Actel
        (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
        power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic
        (handheld focused CSSP, no general purpose FPGAs).

        12.5 FPGA prototype

        FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
        prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
        for hardware verification and early software development.

        Verifying the hardware design alongside early software and firmware
        co-design has become mainstream, and prototyping SoC and ASIC designs on
        an FPGA has become a good method to do this.


        Reasons why prototyping is important

        1. Running a SoC design on an FPGA prototype is a reliable way to ensure
        that it is functionally correct. This is compared to designers relying
        only on software simulations to verify that their hardware design is
        sound. Simulation speed and modeling accuracy limitations hinder this
        development.

        2. Due to time constraints, many projects cannot wait until the silicon is
        back from the foundry to start on software tests. FPGA prototyping allows
        for much more time for software development and testing at the
        software-hardware integration stage. This allows many unforeseen software
        bugs, which appear due to today's array of operating systems, applications
        and hardware, to be found.

        3. Prototyping also allows the developer to ensure that all IP
        technologies on the system work well together, off the simulation stage
        and in actual form.

        4. Prototypes have the added advantage of serving as demo platforms for
        SoC clients, bringing in interest early. This speeds up the overall
        development cycle and allows for more enhancement or improvement of the
        chip features than would otherwise have been possible.


        13 SIMULATION RESULTS

        GATE LEVEL

        Fig. VGA CONTROLLER

        GATE LEVEL

        Fig. SYNC UNIT

        TECHNOLOGY SCHEMATIC

        14 APPLICATIONS

        Motion tracking systems

        Image processing systems

        Display systems

        14.1 CONCLUSION

        Hence, implemented on an FPGA, the Controller takes a processor-generated
        picture (pixelated) from memory space and provides digital RGB values for
        each pixel, as well as horizontal and vertical synchronization signals, in
        order to correctly display the picture on a connected monitor.

        REFERENCES

        1. http://en.wikipedia.org/wiki/Video_Graphics_Array

        2. Eduardo Sanchez, "A VGA display controller".

        3. Hinner, "VGA Interface and video signal documents".


          4 CONTROLLER DESIGN

          For this first we should know how the display device works Consider an

          image below

          640x480

          Original image

          5

          Pixels of image

          Display device also have pixels Display resolution (640480) is a

          standard resolution of display

          Pixel


          This example shows an image with a portion greatly enlarged, in which the
          individual pixels are rendered as little squares and can easily be seen.

          4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

          In digital imaging, a pixel (or picture element [1]) is a single point in a raster
          image. The pixel is the smallest addressable screen element; it is the smallest
          unit of picture which can be controlled. Each pixel has its own address. The address
          of a pixel corresponds to its coordinates. Pixels are normally arranged in a
          2-dimensional grid and are often represented using dots or squares. Each pixel is
          a sample of an original image, where more samples typically provide more
          accurate representations of the original. The intensity of each pixel is variable.
          In color image systems, a color is typically represented by three or four
          component intensities, such as red, green, and blue, or cyan, magenta, yellow,
          and black.

          Color refreshing rate

          Color refreshing rate = 60 Hz

          The color refreshing rate (refresh rate) is the number of frames that are
          transmitted per second.

          The VGA controller must transmit a minimum of 60 frames per second.

          5 Hardware Description

          5.1 Block Diagram

          5.1 VGA Synchronization Unit

          The Synchronization Unit provides the horizontal and vertical
          synchronization signals, HSYNC and VSYNC, that are required to correctly
          display a picture frame within the confines of a monitor's display area.

          These synchronization signals are used as control inputs by the
          monitor's horizontal and vertical deflection circuits. These circuits deflect the
          electrons emitted by the three primary color electron guns (Red, Green, Blue)
          left to right and from top to bottom, respectively. HSYNC provides the start
          and stop times for the horizontal deflection circuit, so that a line of pixels is
          correctly drawn across the screen display. VSYNC provides the start and stop
          times for the vertical deflection circuit, so that the lines of a frame are correctly
          drawn from the top to the bottom of the screen display.

          The resolution for the display is defined by the level on the
          RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low,
          the 800x600 resolution (SVGA) is used.

          Although the resolution determines the area of a monitor's screen
          within which an image can be displayed, the full extents of the chosen
          resolution do not have to be used. The actual extents of the image display area
          on the screen can be controlled by the use of the DISPSIZE_H and
          DISPSIZE_V inputs. These inputs determine the total number of pixels to be
          used in a line and the total number of lines in a frame, respectively.

          5.2 Blank pixel generation

          The total number of viewable pixels in a line and viewable lines in a
          frame is determined by the display resolution chosen through the
          RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
          the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
          chosen display resolution is used or not, the areas of the monitor screen to the
          top, bottom, left, and right of the viewable frame area are blanked by putting
          black pixels at the required line-pixel positions. This has the effect of centering
          the image on the screen.

          The color generated for a pixel in the Pixel Unit depends on whether
          the particular pixel needs to be blanked or not. The Synchronization Unit
          provides a signal to the Pixel Unit for this very reason. This is the line display
          enable signal, en. The signal is checked on each rising edge of the external
          clock signal (CLK) and is set as follows:

          If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
              en = 0 (pixel needs to be blanked: set color to black)
          Else
              en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
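The blanking rule above can be sketched as a small behavioral model. This is only an illustration in Python, not the Controller's actual HDL; the function name `display_enable` and the tiny 8x4 window inside a 10x6 raster are assumptions chosen to keep the numbers easy to check.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return en: 1 for a viewable pixel, 0 when the pixel must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: force black
    return 1      # inside the viewable area: generate the RGB color


# Hypothetical example: an 8x4 viewable window scanned over a 10x6 raster.
PIX_TOTAL, LIN_TOTAL = 8, 4
viewable = sum(
    display_enable(h, v, PIX_TOTAL, LIN_TOTAL)
    for v in range(6)
    for h in range(10)
)
print(viewable)  # 32 viewable positions (8 x 4)
```

Every raster position outside the 8x4 window returns 0, which is what the Pixel Unit would use to load black.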

          6 SIGNALS IN VGA CONTROLLER

          For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the horizontal
          timings are:

          Parameter                   Value      Unit
          Clock frequency             25.175     MHz
          Horizontal pixels           640
          Horizontal sync polarity    Negative
          Total time for each line    31.77      µs
          Front porch (A)             0.94       µs
          Sync pulse length (B)       3.77       µs
          Back porch (C)              1.89       µs
          Active video (D)            25.17      µs

          (Total horizontal sync time: 6.60 µs)

          6.1 The vertical timings are:

          Parameter                   Value      Unit
          Vertical lines              480
          Vertical sync polarity      Negative
          Vertical frequency          59.94      Hz
          Front porch (E)             0.35       ms
          Sync pulse length (F)       0.06       ms
          Back porch (G)              1.02       ms
          Active video (H)            15.25      ms

          (Total vertical sync time: 1.43 ms)

          640 x 400 at 70 Hz is the video mode used for booting most x86 personal
          computers.

          640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

          It should be noted that the actual timings vary slightly. For example, for
          640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
          25.174 MHz gives 633 pixels rather than the expected 640 pixels.
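The arithmetic behind that remark can be checked with a few lines of Python. The sketch below only multiplies a time interval by the pixel frequency; the helper name `clocks_in` is hypothetical, and the figures are the ones quoted in the timing tables (25.17 µs active video, 31.77 µs per line, 25.174 MHz and 25.175 MHz clocks).

```python
import math

PIXEL_CLOCK_MHZ = 25.174   # pixel frequency quoted in the remark above
ACTIVE_VIDEO_US = 25.17    # active video time (D)
LINE_TIME_US = 31.77       # total time for each line


def clocks_in(interval_us: float, clock_mhz: float) -> float:
    """Pixel-clock periods that fit in an interval (microseconds x MHz = cycles)."""
    return interval_us * clock_mhz


active_pixels = math.floor(clocks_in(ACTIVE_VIDEO_US, PIXEL_CLOCK_MHZ))
line_clocks = round(clocks_in(LINE_TIME_US, 25.175))
print(active_pixels)  # 633 whole pixels fit in the quoted active time
print(line_clocks)    # 800 clocks per line at 25.175 MHz
```

The shortfall from 640 comes from the rounding of the published microsecond values, not from the mode itself.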

          6.2 Generation of the horizontal synchronization signal - HSYNC

          The HSYNC signal is High (inactive) after an external reset signal
          (RST) is received by the VGA Controller. The signal is updated on each rising
          edge of the external clock signal (CLK).

          The state of the HSYNC signal depends on the value stored in the horizontal
          counter, and is driven low when:

          HCOUNT >= (PixTotal + BlankingLeft)

          and remains low while:

          HCOUNT < (PixTotal + BlankingLeft + TLSync)
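The HSYNC window described above can be modelled as a simple comparison on the horizontal counter. The following Python sketch is an assumption-laden illustration: `hsync_level` is a hypothetical name, and the values (8 viewable pixels, 2 clocks of blanking, a 3-clock sync pulse, a 16-clock line) are deliberately small and not real VGA timings.

```python
def hsync_level(hcount: int, pix_total: int, blanking_left: int, tl_sync: int) -> int:
    """Return the HSYNC level (active low) for a given horizontal counter value."""
    start = pix_total + blanking_left
    return 0 if start <= hcount < start + tl_sync else 1


# Hypothetical miniature line: 8 viewable pixels, 2 blanking clocks,
# 3-clock sync pulse, 16 clocks per line in total.
LINE_PERIOD = 16
line = [hsync_level(h, pix_total=8, blanking_left=2, tl_sync=3)
        for h in range(LINE_PERIOD)]
print(line)           # [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1]
print(line.index(0))  # 10: first clock on which HSYNC is low
print(line.count(0))  # 3: pulse width equals TLSync
```

The same comparison, applied to VCOUNT with LinTotal and TFSync, gives the shape of VSYNC.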

          Vertical (Frame) Period

          6.3 Generation of the vertical synchronization signal - VSYNC

          The VSYNC signal is High (inactive) after an external reset signal
          (RST) is received by the VGA Controller. The signal is updated after every
          line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

          The state of the VSYNC signal depends on the value stored in the vertical
          counter, and is driven low when:

          VCOUNT >= (LinTotal + BlankingLeft)

          and remains low while:

          VCOUNT < (LinTotal + BlankingLeft + TFSync)

          Address Counter (MEMADDR)

          The address counter is used to store the position of the next consecutive
          pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
          signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
          the next pixel from picture memory.

          The counter is reset to zero when the VGA Controller receives an external
          reset signal (RST). The size of the counter depends on the values chosen for
          DISPSIZE_H and DISPSIZE_V, as the range is simply:

          0 to (PixTotal x LinTotal) - 1

          Taking the maximum number of pixels in a line and lines in a frame for each
          of the supported display resolutions, the maximum ranges for the counter are:

          • 640x480: 0 to 307199
          • 800x600: 0 to 479999

          The counter has 19-bit resolution.

          While the value in the horizontal counter (HCOUNT) is less than the
          total number of viewable pixels in a line (PixTotal, the integer value of
          DISPSIZE_H), the counter is incremented on the rising edge of the external
          clock signal (CLK). Pixel addressing within the frame is consecutive. When
          the counter reaches the last pixel in a line, its incremented value is the first
          pixel in the next line down.

          The address counter will continue to be incremented until the value in the
          vertical counter (VCOUNT) is greater than or equal to the total number of
          viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
          point, it will be rolled over to zero.
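As a rough check of the counter width and the increment rule, the behavior can be sketched in Python. This is a simplified model rather than the Controller's implementation: `address_bits` and `scan_addresses` are hypothetical names, and the 4x3 frame inside a 6x5 raster is an assumed toy size.

```python
def address_bits(pix_total: int, lin_total: int) -> int:
    """Bits needed to hold the largest address, (PixTotal x LinTotal) - 1."""
    return (pix_total * lin_total - 1).bit_length()


def scan_addresses(pix_total, lin_total, line_period, frame_period):
    """Return the MEMADDR values issued during one frame scan."""
    issued = []
    memaddr = 0
    for vcount in range(frame_period):
        for hcount in range(line_period):
            if vcount >= lin_total:
                memaddr = 0                # roll over once past the last viewable line
            elif hcount < pix_total:
                issued.append(memaddr)     # address of the pixel being fetched
                memaddr += 1               # consecutive addressing across lines
    return issued


print(address_bits(640, 480))  # 19
print(address_bits(800, 600))  # 19
print(scan_addresses(4, 3, line_period=6, frame_period=5))  # [0, 1, ..., 11]
```

Both supported resolutions fit in 19 bits because 479999 is below 2^19 = 524288.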

          7 VGA Controller

          The VGA Controller provides a simple 8-bit interface between a host
          microcontroller and any VGA-compatible monitor. This variant of the
          Controller provides six modes of display, depending on the resolution chosen
          (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
          White, 16 Colors, or 64 Colors).

          7.1 VGA Controller Tasks

          1) To display the frames at a minimum 60 Hz refreshing rate:

          1 sec = 60 frames
          1 frame = max 1/60 sec
          1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns)

          The blanking (porch and sync) intervals also consume clock cycles, so:

          Max time for 1 pixel = 40 ns

          2) To send information for each and every pixel of a frame by means of
          RGB signals.

          3) After completely displaying 1 row of information, the column must be
          reset and the row has to be incremented; this is done by the Hsync signal.

          4) After completing 1 frame of information, a new frame has to be started;
          for this, the row and column are reset; this is done by the Vsync signal.

          => For giving information for 1 pixel we use 1 clock (for controlling all
          operations):

          1 clock = 1 pixel
          1 clock cycle period = 40 ns
          Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

          So if we use a 25 MHz clock and display 1 pixel of information on each and every
          clock, we can attain a minimum 60 Hz refreshing rate.

          Do not use a clock frequency less than 25 MHz.

          Ex:

          15 MHz: time period = more, no. of frames/sec = less (decreases);
                  it is unable to display 60 frames/sec
          25 MHz: effectively utilized frequency
          30 MHz: time period = less, no. of frames/sec = more (increases)
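The relationship between clock frequency and refresh rate can be worked through numerically. The sketch below is illustrative only: the function names are hypothetical, and the raster total of 800 clocks per line by 525 lines per frame is an assumed, commonly used figure for 640x480 that does not come from this text. The unrounded per-pixel time for the viewable area alone is about 54 ns.

```python
def max_pixel_time_ns(width: int, height: int, refresh_hz: float) -> float:
    """Time available per viewable pixel if a frame were nothing but pixels."""
    return 1e9 / (refresh_hz * width * height)


def refresh_rate_hz(clock_hz: float, clocks_per_line: int = 800,
                    lines_per_frame: int = 525) -> float:
    """Frames per second for a given pixel clock (raster totals are assumed)."""
    return clock_hz / (clocks_per_line * lines_per_frame)


print(round(max_pixel_time_ns(640, 480, 60), 2))  # 54.25 ns
print(round(1 / 40e-9 / 1e6))                     # 25 (MHz) for a 40 ns period
for mhz in (15, 25, 30):
    print(mhz, "MHz ->", round(refresh_rate_hz(mhz * 1e6), 1), "Hz")
# 15 MHz -> 35.7 Hz, 25 MHz -> 59.5 Hz, 30 MHz -> 71.4 Hz
```

Under these assumed totals, a 15 MHz clock falls well short of 60 frames per second, while 30 MHz lands near the 72 Hz figure quoted for that clock.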

          Signals used for image display:

          R (Red)
          G (Green)
          B (Blue)
          HSYNC (Horizontal Synchronization signal)
          VSYNC (Vertical Synchronization signal)

          All the above signals are used to control the image data so that it is correctly
          displayed on the VGA display monitor.

          7.2 Timings for front porch, sync pulse, and back porch

                         Horizontal sync    Vertical sync
          Front porch    0.94 µs            0.35 ms
          Sync pulse     3.77 µs            0.063 ms
          Back porch     1.89 µs            1.02 ms

          8 FUNCTIONAL DESCRIPTION

          8.1 Symbol

          VGA Controller Symbol

          Pin description

          The pin description is shown in the table below.

          Table: VGA Pin description

          Name | Type | Polarity | Bus size | Description

          Global Control Signals

          CLK | I | Rising | | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
              640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
              800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz
          RST | I | High | | Global system reset.

          VGA Input Settings Signals

          RESOLUTION | I | High/Low | | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).
          CMOD | I | | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.
          DISPSIZE_H | I | | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.
          DISPSIZE_V | I | | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

          Data Memory Control Signals

          RD | O | High | | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.
          DATA | I | | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows: Black & White mode: 8 1-bit pixels; 16 Colors mode: 2 4-bit pixels; 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.
          ADDR_PIXEL | O | | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

          VGA Monitor Control Signals

          HSYNC | O | Falling | | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.
          VSYNC | O | Falling | | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.
          R1, R0 | O | High/Low | | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
          G1, G0 | O | High/Low | | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
          B1, B0 | O | High/Low | | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

          9 VGA SIGNAL TIMING

          Figure 3 summarizes the signal timing involved in sending a line of
          pixels and a frame of lines. The actual time values differ according to the
          resolution selected (640x480 or 800x600), the processor-defined values for
          DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
          (CLK).

          9.1 VGA PIXEL UNIT

          The Pixel Unit provides access to the pixelated image stored in external
          picture memory, reading in data a byte at a time and formatting each pixel to
          be displayed. For each pixel, the 6-bit RGB value required for the control of
          the monitor's three primary color electron guns is generated, so that the pixel
          is displayed on the screen with the correct color.

          Accessing the stored image

          The image to be displayed on the monitor screen is written by the host
          microcontroller into external memory space (RAM). This memory space can
          be located anywhere (e.g. using a block of RAM within the FPGA design, or
          using the SRAM on the NanoBoard itself).

          Pixel data is stored in the memory space in bytes. The number of pixels in a
          byte depends on which color palette is being used for the image (selected by
          the CMOD input):

          • in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
          • in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
          • in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel
            (DATA[7..6] are not used)

          The size of memory required to store a picture is determined by the total
          number of viewable pixels in a line (determined by DISPSIZE_H), the total
          number of viewable lines in a frame (determined by DISPSIZE_V), and the
          number of pixels stored in each byte in memory space:

          Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
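Applying this formula to the supported modes gives a feel for the memory sizes involved. The Python sketch below is a worked example only; the mode keys (`"bw"`, `"16"`, `"64"`) and the function name are hypothetical labels, while the pixels-per-byte figures are the ones stated for each palette.

```python
# Pixels packed into one byte for each color palette.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory: (PixTotal x LinTotal) / pixels per byte, rounded up."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


for mode in ("bw", "16", "64"):
    print(mode, picture_memory_bytes(640, 480, mode), picture_memory_bytes(800, 600, mode))
# bw 38400 60000
# 16 153600 240000
# 64 307200 480000
```

A full 640x480 frame therefore needs 38,400 bytes in Black & White mode but 307,200 bytes in 64 Colors mode.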

          The address in RAM where the next pixel is stored is determined using
          an internal signal provided by the Synchronization Unit, hvcnt, which
          reflects the current contents of the MEMADDR register. The exact addressing
          is described below.

          9.2 Black & White mode

          The picture memory address (the byte of data containing the next 8 pixels)
          is determined by using bits 18..3 of hvcnt and right shifting the contents by
          three:

          ADDR_PIXEL = "00" & hvcnt[18..3]

          16 Colors mode

          The picture memory address (the byte of data containing the next 2 pixels)
          is determined by using bits 18..1 of hvcnt and right shifting the contents by
          one:

          ADDR_PIXEL = "0" & hvcnt[18..1]

          64 Colors mode

          The picture memory address (the byte of data containing the next pixel) is
          determined by using the full value of hvcnt:

          ADDR_PIXEL = hvcnt[18..0]
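The three addressing rules amount to a right shift of the pixel counter by 3, 1, or 0 bits. A minimal Python sketch, assuming hypothetical mode keys and a function name `addr_pixel` that do not appear in the source:

```python
# Right-shift applied to the 19-bit pixel counter hvcnt in each palette mode.
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory holding the pixel numbered hvcnt."""
    return (hvcnt & 0x7FFFF) >> ADDRESS_SHIFT[mode]  # keep 19 bits, then shift


print(addr_pixel(17, "bw"))  # 2: pixels 16..23 share byte 2
print(addr_pixel(17, "16"))  # 8: pixels 16 and 17 share byte 8
print(addr_pixel(17, "64"))  # 17: one pixel per byte
```

Pixels that share a byte map to the same address, which is why the low bits of hvcnt can be dropped.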

          The Pixel register (PIXREG)

          The Pixel register is used to receive the byte of pixel data read from the
          current address in memory space. The register PIXREG is reset to zero when
          the VGA Controller receives an external reset signal (RST).

          The register is updated on each rising edge of the CLK signal. Data can be
          read from the memory space as long as the RD signal is active (High). The RD
          signal is itself controlled by the external line display enable signal, enex. This
          internally generated signal is defined as follows:

          If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
             ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
              enex = 0
          Else
              enex = 1

          When enex is Low, read access from memory is disabled (RD = 0).

          The point at which data is loaded from memory into PIXREG depends on the
          particular color palette that is chosen: Black & White, 16 Colors, or 64
          Colors.

          Black & White mode

          The next byte of data will be loaded into the register whenever the
          lowest three bits of the hvcnt signal, received from the Synchronization Unit,
          are "000".

          For the currently loaded byte, the active pixel is always in the lowest bit
          position of the Pixel register. Each pixel in the data byte is moved into this
          active pixel position by shifting the contents of the register right by one bit on
          each rising edge of CLK.

          16 Colors mode

          The next byte of data will be loaded into the register whenever the
          lowest bit of the hvcnt signal, received from the Synchronization Unit, is a
          "0".

          For the currently loaded byte, the active pixel is always in the low order
          nibble of the Pixel register. Remember that in this mode, each byte of data
          contains two pixels. The second pixel is moved into this active pixel position
          by shifting the contents of the register right by four bits on the rising edge of
          CLK.

          64 Colors mode

          The next byte of data will be loaded into the register on the rising edge
          of the external system clock signal (CLK). In this mode, the read of pixel data
          does not depend on the status of the hvcnt signal received from the
          Synchronization Unit.
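The load-and-shift behavior of the Pixel register in the three modes can be imitated in software. This Python sketch is an assumed behavioral model, not the register's HDL: the name `unpack_pixels` and the mode keys are hypothetical, and it simply returns the pixels in the order they would reach the active (lowest) position.

```python
# (bits per pixel, pixels per byte) for each palette mode.
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}


def unpack_pixels(byte: int, mode: str) -> list:
    """Return the pixels of one data byte in display order.

    The active pixel always sits in the lowest bits of the register; each
    following pixel is brought there by shifting the register right.
    """
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    reg = byte & 0xFF
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)  # read the active pixel position
        reg >>= width              # shift the next pixel into place
    return pixels


print(unpack_pixels(0b10110001, "bw"))  # [1, 0, 0, 0, 1, 1, 0, 1]
print(unpack_pixels(0xA3, "16"))        # [3, 10]
print(unpack_pixels(0xFF, "64"))        # [63]: bits 7 and 6 are ignored
```

In 64 Colors mode no shifting is needed, since each byte carries a single pixel.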

          9.3 The RGB register (RGBREG)

          The RGB register is used to store the six bits that are required for
          driving the red, green, and blue color guns of the target monitor. When the
          chosen color palette is either Black & White or 16 Colors, these six bits are
          obtained by mapping the value of the active pixel to a predefined RGB code.
          When the chosen color palette is 64 Colors, the actual pixel value is used
          directly.

          The register RGBREG is reset to zero (000000) when the VGA
          Controller receives an external reset signal (RST). This RGB code represents
          black.

          The register is updated on each rising edge of the CLK signal, and the
          value loaded is dependent on the state of the line display enable signal, en.
          When en is Low, blanking is required and RGBREG is loaded with the code
          for black (000000).

          Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
          and a 4-bit pixel (16 Colors mode) into the required RGB color code.

          The RGB color code stored in the RGB register is output from the VGA
          Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
          and B1).

          The monitor itself expects analog signals as inputs to its electron gun control
          circuits. This is achieved by using 2-bit digital to analog converter circuitry,
          located on the NanoBoard itself, as shown in Figure 4.

          Figure 9.1 Digital to Analog RGB conversion

          For each color, the 2-bit digital signal from the VGA Controller can be
          converted into 4 distinct analog levels. These levels specify the intensity of
          each of the three primary colors to use when displaying the pixel on the
          monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
          brightness).

          With each analog input being one of four possible levels, the monitor
          can display each pixel on the screen with one of 64 different color
          permutations.
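The count of 64 colors and the four analog levels can be illustrated numerically. The Python sketch below rests on two assumptions that are not stated in the text: that the four DAC levels are evenly spaced between 0 V and 0.7 V, and that the 6-bit code is ordered as R1 R0 G1 G0 B1 B0. Both function names are hypothetical.

```python
def dac_voltage(code: int, v_max: float = 0.7) -> float:
    """Analog level for a 2-bit code, assuming four evenly spaced steps."""
    return v_max * (code & 0b11) / 3


def split_rgb(rgb6: int) -> tuple:
    """Split a 6-bit code into 2-bit (R, G, B); the bit order is an assumption."""
    return (rgb6 >> 4) & 0b11, (rgb6 >> 2) & 0b11, rgb6 & 0b11


levels = [round(dac_voltage(c), 3) for c in range(4)]
print(levels)                # [0.0, 0.233, 0.467, 0.7]
print(len(levels) ** 3)      # 64 color permutations
print(split_rgb(0b110100))   # (3, 1, 0): full red, low green, no blue
```

Four levels on each of three channels give 4 x 4 x 4 = 64 combinations, matching the 64 Colors palette.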

          Using Block RAM in Spartan-3E Generation FPGAs

          For applications requiring large on-chip memories, Spartan™-3
          Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
          Using various configuration options, SelectRAM blocks create RAM, ROM,
          FIFOs, large look-up tables, data width converters, circular buffers, and shift
          registers, each supporting various data widths and depths. This application note
          describes the features and capabilities of block SelectRAM and illustrates how
          to specify the various options using the Xilinx CORE Generator™ system or
          via VHDL or Verilog instantiation.

          Table 2: Block RAM Available in Spartan-3E Devices

          Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
          XC3S100E     1             4                       4                  73,728           72
          XC3S250E     2             6                       12                 221,184          216
          XC3S500E     2             10                      20                 368,640          360
          XC3S1200E    2             14                      28                 516,096          504
          XC3S1600E    2             18                      36                 663,552          648

          Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
          which is allocated to data storage and, in some memory configurations, an
          additional 2K bits allocated to parity or additional "plus" data bits. Physically,
          the block RAM memory has two completely independent access ports, labeled
          Port A and Port B. The structure is fully symmetrical, and both ports are
          interchangeable and both ports support data read and write operations. Each
          memory port is synchronous, with its own clock, clock enable, and write
          enable. Read operations are also synchronous and require a clock edge and
          clock enable. Though physically a dual-port memory, block RAM simulates
          single-port memory in an application, as shown in Figure 1. Furthermore, each
          block memory supports multiple configurations or aspect ratios. Table 3
          summarizes the essential SelectRAM features. Cascade multiple block RAMs
          to create deeper and wider memory organizations, with a minimal timing
          penalty incurred through specialized routing resources.
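To relate these block sizes to the picture memory of the VGA Controller, one can estimate how many block RAMs a frame buffer would occupy. The Python sketch below is a rough sizing exercise under stated assumptions: only the 16K data bits of each block are counted, the per-device block totals are taken from Table 2, and the function name `blocks_needed` is hypothetical.

```python
import math

DATA_BITS_PER_BLOCK = 16 * 1024  # 16K data bits per block RAM (parity bits ignored)

# Total RAM blocks per device, as listed in Table 2.
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total: int, lin_total: int, bits_per_pixel: int) -> int:
    """Block RAMs needed to hold one frame at the given pixel depth."""
    return math.ceil(pix_total * lin_total * bits_per_pixel / DATA_BITS_PER_BLOCK)


bw_blocks = blocks_needed(640, 480, 1)
fits = [dev for dev, total in TOTAL_BLOCKS.items() if total >= bw_blocks]
print(bw_blocks)                  # 19
print(fits)                       # ['XC3S500E', 'XC3S1200E', 'XC3S1600E']
print(blocks_needed(640, 480, 4)) # 75: a full 16-color frame exceeds every device listed
```

By this estimate, a full 640x480 Black & White frame fits on chip only from the XC3S500E upward; deeper palettes would need a smaller display window or external SRAM.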


          10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

          Why (V)HDL?

          Interoperability

          Technology independence

          Design reuse

          Several levels of abstraction

          Readability

          Standard language

          Widely supported

          What is VHDL?

          VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

          Design specification language

          Design entry language

          Design simulation language

          Design documentation language

          An alternative to schematics

          10.1 Brief History

          VHDL was developed in the early 1980s for managing design problems that
          involved large circuits and multiple teams of engineers.

          Funded by the US Department of Defense.

          The first publicly available version was released in 1985.

          In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
          presented with a proposal to standardize VHDL.

          In 1987: standardization => IEEE 1076-1987

          An improved version of the language was released in 1994 => IEEE
          standard 1076-1993

          10.2 Related Standards

          IEEE 1076 doesn't support simulation conditions such as unknown and
          high-impedance.

          Soon after IEEE 1076-1987 was released, simulator companies began using
          their own non-standard types => VHDL was becoming non-standard.

          The IEEE 1164 standard was developed by the IEEE.

          IEEE 1164 contains definitions for a nine-valued data type, std_logic.

          IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
          to actual hardware.

          Defines, e.g., two numeric types: signed and unsigned.

          VHDL Environment

          Design Units

          Segments of VHDL code that can be compiled separately and stored in a
          library.

          Entities

          A black box with an interface definition
          Defines the inputs/outputs of a component (defines pins)
          A way to represent modularity in VHDL
          Similar to a symbol in a schematic
          The entity declaration describes the entity

          E.g.:

          -- std_logic and std_logic_vector come from the IEEE 1164 package
          library ieee;
          use ieee.std_logic_1164.all;

          entity Comparator is
            port (A, B : in  std_logic_vector(7 downto 0);
                  EQ   : out std_logic);
          end Comparator;

          Ports: provide channels of communication between the component and its
          environment.

          Each port must have a name, a direction, and a type.

          An entity may have NO port declaration.

          Port directions

          In: The value of a port can be read inside the component, but cannot be
          assigned. Multiple reads of the port are allowed.

          Out: Assignments can be made to a port, but data from a port cannot be read.
          Multiple assignments are allowed.

          Inout: Bi-directional; assignments can be made and data can be read. Multiple
          assignments are allowed.

          Buffer: An out port with read capability. May have at most one assignment
          (not recommended).

          Architectures

          Every entity has at least one architecture.

          One entity can have several architectures.

          Architectures can describe a design using:
          - Behavior
          - Structure
          - Dataflow

          Architectures can describe a design on many levels:
          - Gate level
          - RTL (Register Transfer Level)
          - Behavioral level

          A configuration declaration links an architecture to an entity.

          E.g.:

          architecture Comparator1 of Comparator is
          begin
            -- EQ is '1' only when both 8-bit inputs are equal
            EQ <= '1' when (A = B) else '0';
          end Comparator1;

          Configurations

          Link the entity declaration and architecture body together.

          The concept of a default configuration is a bit messy in VHDL '87:
          - The last architecture analyzed links to the entity

          Can be used to change simulation behavior without re-analyzing the VHDL
          source.

          Complex configuration declarations are ignored in synthesis.

          Some entities can have, e.g., a gate level architecture and a behavioral
          architecture.

          Are always optional.

          Packages

          Packages contain information common to many design units.

          1. Package declaration
          - constant declarations
          - type and subtype declarations
          - function and procedure declarations
          - global signal declarations
          - file declarations
          - component declarations

          2. Package body
          - is not necessarily needed
          - function bodies
          - procedure bodies

          Packages are meant for encapsulating data which can be shared globally among
          several design units.

          A package consists of a declaration part and an optional body part.

          A package declaration can contain:
          - type and subtype declarations
          - subprograms
          - constants
          - alias declarations
          - global signal declarations
          - file declarations
          - component declarations

          A package body consists of:
          - subprogram declarations and bodies
          - type and subtype declarations
          - deferred constants
          - file declarations

          Libraries

          Collection of VHDL design units (database):

          1. Packages
             - package declaration
             - package body
          2. Entities (entity declaration)
          3. Architectures (architecture body)
          4. Configurations (configuration declarations)

          Usually a directory in a UNIX file system.

          Can also be any other kind of database.

          Levels of Abstraction

          VHDL supports many possible styles of design description, which differ
          primarily in how closely they relate to the hardware

          It is possible to describe a circuit in a number of ways, listed here from
          the lowest to the highest level of abstraction

          1. Structural

          2. Dataflow

          3. Behavioral

          Structural VHDL description

          The circuit is described in terms of its components

          The description can range from low level (e.g. transistor-level description)
          to high level (e.g. block diagram)

          For large circuits, a low-level description quickly becomes impractical
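          As an illustration of the structural style, the sketch below builds a
          2-bit equality comparator out of two XNOR components. The entity names
          xnor2 and eq2 are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Leaf component: a single XNOR gate.
entity xnor2 is
  port (a, b : in std_logic; y : out std_logic);
end xnor2;

architecture dataflow of xnor2 is
begin
  y <= a xnor b;
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity eq2 is
  port (a, b : in  std_logic_vector(1 downto 0);
        eq   : out std_logic);
end eq2;

-- Structural architecture: the circuit is a netlist of component instances.
architecture structural of eq2 is
  component xnor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal m : std_logic_vector(1 downto 0);  -- per-bit match signals
begin
  u0 : xnor2 port map (a => a(0), b => b(0), y => m(0));
  u1 : xnor2 port map (a => a(1), b => b(1), y => m(1));
  -- The final AND is written inline to keep the sketch short; a purely
  -- structural description would instantiate an AND gate component here.
  eq <= m(0) and m(1);
end structural;
```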

          10.3 Dataflow VHDL Description

          The circuit is described in terms of how data moves through the system

          In the dataflow style you describe how information flows between registers in
          the system

          The combinational logic is described at a relatively high level; the placement
          and operation of registers is specified quite precisely

          The behavior of the system over time is defined by registers

          There are no built-in registers in the VHDL language

          – Either a lower-level description

          – or a behavioral description of sequential elements is needed

          The lower-level register descriptions must be created or obtained

          If there are no third-party models for registers, you must write the behavioral
          description of the registers yourself

          The behavioral description can be provided in the form of subprograms
          (functions or procedures)
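          One possible reading of the last point is sketched below: a register is
          described once as a procedure and then called concurrently from a
          dataflow architecture. The names reg_pkg, dff and pipe are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package reg_pkg is
  -- Behavioral description of a D flip-flop, packaged as a subprogram.
  procedure dff (signal clk, d : in std_logic; signal q : out std_logic);
end package reg_pkg;

package body reg_pkg is
  procedure dff (signal clk, d : in std_logic; signal q : out std_logic) is
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end procedure dff;
end package body reg_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.reg_pkg.all;

entity pipe is
  port (clk, a, b : in std_logic; y : out std_logic);
end pipe;

architecture dataflow of pipe is
  signal s : std_logic;
begin
  -- Combinational logic, described at a relatively high level.
  s <= a xor b;
  -- Concurrent procedure call: re-evaluated whenever clk or s changes,
  -- which places a register between s and y.
  dff(clk, s, y);
end dataflow;
```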

          10.4 Behavioral VHDL Description

          The circuit is described in terms of its operation over time

          The representation might include, e.g., state diagrams, timing diagrams and
          algorithmic descriptions

          The concept of time may be expressed precisely using delays (e.g. A <= B
          after 10 ns)

          If no actual delay is used, only the order of sequential operations is defined

          In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
          timing specifications

          The actual timing results depend on the implementation technology and the
          efficiency of the synthesis tool

          There are a few tools for behavioral synthesis
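          The delay example from the notes can be tried in a small self-checking
          simulation. The entity name delay_demo and the check times are
          assumptions; the "after" clause is meaningful in simulation only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- a follows b with an explicit 10 ns delay.
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';                 -- b changes at time 0
    wait for 5 ns;
    assert a = '0' report "a changed too early" severity error;
    wait for 10 ns;           -- now at 15 ns, past the 10 ns delay
    assert a = '1' report "a did not follow b" severity error;
    wait;                     -- suspend forever
  end process;
end behavioral;
```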

          Concurrent vs. Sequential

          Processes

          Basic simulation concept in VHDL

          A VHDL description can always be broken up into interconnected processes

          Quite similar to a Unix process

          Process keyword in VHDL

          A process statement is a concurrent statement

          Statements inside a process statement are sequential statements

          A process must contain either a sensitivity list or wait statement(s), but NOT both

          The sensitivity list or wait statement(s) name the signals which wake the process up

          General format:

          process [(sensitivity_list)]
            process_declarative_part
          begin
            process_statements
            [wait_statement]
          end process;
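          The two permitted forms can be compared side by side. In this sketch
          (entity name and ports are assumptions) both processes describe the
          same AND gate: one uses a sensitivity list, the other a wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end and_gate;

architecture behavioral of and_gate is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list, explicit wait at the end of the process.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process;
end behavioral;
```

          Both processes wake up on any event on a or b, so y1 and y2 always
          carry the same value.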


          11 VGA CONTROLLER CODE

          library IEEE;
          use IEEE.STD_LOGIC_1164.ALL;
          use IEEE.NUMERIC_STD.ALL;
          -- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not needed in this unit and
          -- should not be mixed with NUMERIC_STD, so they are omitted here.

          ---- Uncomment the following library declaration if instantiating
          ---- any Xilinx primitives in this code.
          --library UNISIM;
          --use UNISIM.VComponents.all;

          entity vga_controller is
            port (
              clk50m : in  std_logic;   -- 50 MHz board clock
              rstn   : in  std_logic;   -- active-low reset
              hsync  : out std_logic;
              vsync  : out std_logic;
              red    : out std_logic_vector(0 downto 0);
              green  : out std_logic_vector(0 downto 0);
              blue   : out std_logic_vector(0 downto 0));
          end vga_controller;

          architecture Behavioral of vga_controller is

            -- image memory: 18-bit address, 1 bit per pixel
            component memory
              port (
                addr  : in  std_logic_vector(17 downto 0);
                clk   : in  std_logic;
                dout  : out std_logic_vector(0 downto 0);
                en    : in  std_logic;
                sinit : in  std_logic);
            end component;

            -- port modes match the sync_unit entity (line_count and c_hs are outputs)
            component sync_unit
              port (
                clk        : in  std_logic;
                rstn       : in  std_logic;
                hsync      : out std_logic;  -- horizontal sync pulse (row)
                vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
                v_en_reg   : out std_logic;
                h_en       : out std_logic;
                v_en       : out std_logic;
                line_count : out integer;
                c_hs       : out integer);
            end component;

            component pixelgen
              port (
                clk        : in  std_logic;
                rstn       : in  std_logic;
                hsync      : in  std_logic;
                vsync      : in  std_logic;
                h_en       : in  std_logic;
                v_en       : in  std_logic;
                v_en_reg   : in  std_logic;
                data_in    : in  std_logic_vector(0 downto 0);
                line_count : in  integer;
                c_hs       : in  integer;
                red        : out std_logic_vector(0 downto 0);
                green      : out std_logic_vector(0 downto 0);
                blue       : out std_logic_vector(0 downto 0);
                addr       : out std_logic_vector(17 downto 0);
                rd         : out std_logic);
            end component;

            signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
            signal line_count_s, c_hs_s, addr : integer;
            signal data_s : std_logic_vector(0 downto 0);
            signal addr_s : std_logic_vector(17 downto 0);

          begin

            --addr_s <= std_logic_vector(to_unsigned(addr, 18));

            -- divide the 50 MHz input clock by two to obtain the pixel clock
            process (clk50m, rstn)
            begin
              if (rstn = '0') then
                clk <= '0';
              elsif (clk50m = '1' and clk50m'event) then
                clk <= not clk;
              end if;
            end process;

            hsync <= hsync_s;
            vsync <= vsync_s;

            sync : sync_unit port map (
              clk        => clk,
              rstn       => rstn,
              hsync      => hsync_s,
              vsync      => vsync_s,
              v_en_reg   => v_en_reg_s,
              line_count => line_count_s,
              h_en       => h_en_s,
              v_en       => v_en_s,
              c_hs       => c_hs_s
            );

            pixel : pixelgen port map (
              clk        => clk,
              rstn       => rstn,
              hsync      => hsync_s,
              vsync      => vsync_s,
              h_en       => h_en_s,
              v_en       => v_en_s,
              v_en_reg   => v_en_reg_s,
              data_in    => data_s,
              line_count => line_count_s,
              c_hs       => c_hs_s,
              red        => red,
              green      => green,
              blue       => blue,
              addr       => addr_s,
              rd         => rd_s
            );

            -- note: rstn is active low; check that this matches the SINIT polarity
            -- of the generated memory core
            mem12 : memory port map (
              addr  => addr_s,
              clk   => clk,
              dout  => data_s,
              en    => rd_s,
              sinit => rstn
            );

          end Behavioral;


          SYNCHRONIZATION BLOCK CODE

          library IEEE;
          use IEEE.STD_LOGIC_1164.ALL;
          use IEEE.STD_LOGIC_ARITH.ALL;
          use IEEE.STD_LOGIC_UNSIGNED.ALL;

          entity sync_unit is
            port (
              clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : out std_logic;  -- horizontal sync pulse (row)
              vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
              v_en_reg   : out std_logic;
              h_en       : out std_logic;
              v_en       : out std_logic;
              line_count : out integer;
              c_hs       : out integer);
          end sync_unit;

          architecture arch_sync_unit of sync_unit is

            signal h_en_s, v_en_s       : std_logic;
            signal line_count_s, c_hs_s : integer;

          begin

            c_hs       <= c_hs_s;
            line_count <= line_count_s;
            h_en       <= h_en_s;
            v_en       <= v_en_s;

            -- hsync counter: counts pixel clocks 0 .. 793 within one line
            process (clk, rstn)
            begin
              if (rstn = '0') then
                c_hs_s <= 0;
              elsif (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                  c_hs_s <= 0;
                else
                  c_hs_s <= c_hs_s + 1;
                end if;
              end if;
            end process;

            -- vsync counter
            -- vertical line counter: advances on each rising edge of h_en_s,
            -- counting lines 0 .. 515
            process (h_en_s, rstn)
            begin
              if (rstn = '0') then
                line_count_s <= 0;
              elsif (h_en_s = '1' and h_en_s'event) then
                line_count_s <= line_count_s + 1;
                if (line_count_s = 515) then
                  line_count_s <= 0;
                end if;
              end if;
            end process;

            -- hsync pulse: low while the horizontal counter is 0 .. 95
            process (clk, rstn)
            begin
              if (rstn = '0') then
                hsync <= '1';
              elsif (clk = '1' and clk'event) then
                if (c_hs_s <= 95) then
                  hsync <= '0';
                else
                  hsync <= '1';
                end if;
              end if;
            end process;

            -- Horizontal enable (row information is sent in this period)
            process (clk, rstn)
            begin
              if (rstn = '0') then
                h_en_s <= '0';
              elsif (clk = '1' and clk'event) then
                if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                  h_en_s <= '1';
                else
                  h_en_s <= '0';
                end if;
              end if;
            end process;

            -- Vertical enable (data is valid in this region)
            process (clk, rstn)
            begin
              if (rstn = '0') then
                v_en_s <= '0';
              elsif (clk = '1' and clk'event) then
                if ((line_count_s >= 35) and (line_count_s <= 515)) then
                  v_en_s <= '1';
                else
                  v_en_s <= '0';
                end if;
              end if;
            end process;

            -- Vertical sync, front porch, back porch
            process (clk, rstn)
            begin
              if (rstn = '0') then
                vsync <= '1';
              elsif (clk = '1' and clk'event) then
                if (line_count_s < 2) then
                  vsync <= '0';
                else
                  vsync <= '1';
                end if;
              end if;
            end process;

            -- v_en delayed by one clock, used by pixelgen to detect the frame start
            process (clk)
            begin
              if (clk = '1' and clk'event) then
                v_en_reg <= v_en_s;
              end if;
            end process;

          end arch_sync_unit;
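          The counter limits above imply a line rate and a frame rate. The check
          below works them out with concurrent assertions, assuming the pixel
          clock is the 50 MHz input divided by two as in the top-level listing.
          The entity name and constant names are hypothetical.

```vhdl
entity sync_timing_check is
end sync_timing_check;

architecture check of sync_timing_check is
  constant PIXEL_CLK_HZ  : natural := 25_000_000;      -- clk50m / 2
  constant H_TOTAL       : natural := 794;             -- c_hs_s counts 0 .. 793
  constant H_SYNC        : natural := 96;              -- hsync low for 0 .. 95
  constant H_ENABLE      : natural := 778 - 143 + 1;   -- h_en_s window
  constant V_TOTAL       : natural := 516;             -- line_count_s counts 0 .. 515
  constant V_SYNC        : natural := 2;               -- vsync low for lines 0 and 1
  constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;
  constant FRAME_RATE_HZ : natural := LINE_RATE_HZ / V_TOTAL;
begin
  assert H_ENABLE = 636
    report "unexpected horizontal enable width" severity error;
  assert LINE_RATE_HZ = 31486    -- about 31.5 kHz
    report "unexpected line rate" severity error;
  assert FRAME_RATE_HZ = 61      -- about 61 frames per second
    report "unexpected frame rate" severity error;
end check;
```

          With these values each line takes 794 pixel clocks, which gives a line
          rate of roughly 31.5 kHz and a refresh rate of roughly 61 Hz.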


          PIXEL GENERATOR

          library IEEE;
          use IEEE.STD_LOGIC_1164.ALL;
          use IEEE.STD_LOGIC_ARITH.ALL;
          use IEEE.STD_LOGIC_UNSIGNED.ALL;

          entity pixelgen is
            port (
              clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : in  std_logic;
              vsync      : in  std_logic;
              h_en       : in  std_logic;
              v_en       : in  std_logic;
              v_en_reg   : in  std_logic;
              data_in    : in  std_logic_vector(0 downto 0);
              line_count : in  integer;
              c_hs       : in  integer;
              red        : out std_logic_vector(0 downto 0);
              green      : out std_logic_vector(0 downto 0);
              blue       : out std_logic_vector(0 downto 0);
              addr       : out std_logic_vector(17 downto 0);
              rd         : out std_logic
            );
          end pixelgen;

          architecture behv of pixelgen is

            signal addr_s : std_logic_vector(17 downto 0);

          begin

            addr <= addr_s;

            process (clk, rstn)
            begin
              if (rstn = '0') then
                rd     <= '0';
                red    <= "0";
                green  <= "0";
                blue   <= "0";
                addr_s <= x"0000" & "00";
              elsif (clk = '1' and clk'event) then
                -- rising edge of v_en: restart at the first pixel of the frame
                if ((v_en = '1') and (v_en_reg = '0')) then
                  addr_s <= (others => '0');
                end if;
                if (v_en = '1' and h_en = '1') then
                  if (line_count <= 514) then
                    if (c_hs <= 655) then
                      -- inside the image: fetch the next pixel from memory
                      rd     <= '1';
                      red    <= data_in;
                      green  <= data_in;
                      blue   <= data_in;
                      addr_s <= addr_s + "000000000000000001";
                    else
                      -- right of the image: green border
                      rd    <= '0';
                      red   <= "0";
                      green <= "1";
                      blue  <= "0";
                    end if;
                  else
                    -- below the image: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                  end if;
                  -- wrap after the last image address
                  if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                  end if;
                else
                  -- blanking: all colors off
                  red   <= "0";
                  green <= "0";
                  blue  <= "0";
                end if;
              end if;
            end process;

          end behv;
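          One reading of the limits in this listing is that the stored image is
          512 x 480 pixels. The check below records that arithmetic. It assumes
          that h_en, being registered in sync_unit, is first seen high by
          pixelgen when c_hs is 144, and it ignores the one-clock lag of v_en at
          line boundaries. All names are hypothetical.

```vhdl
entity framebuffer_size_check is
end framebuffer_size_check;

architecture check of framebuffer_size_check is
  constant FIRST_COL  : natural := 144;  -- first c_hs seen with h_en = '1' (assumed)
  constant LAST_COL   : natural := 655;  -- limit tested in pixelgen
  constant FIRST_LINE : natural := 35;   -- start of the v_en window
  constant LAST_LINE  : natural := 514;  -- limit tested in pixelgen
  constant COLS       : natural := LAST_COL - FIRST_COL + 1;
  constant ROWS       : natural := LAST_LINE - FIRST_LINE + 1;
  -- "111011111111111111": bits 17, 16, 15 set, bit 14 clear, bits 13 .. 0 set
  constant LAST_ADDR  : natural := 2**17 + 2**16 + 2**15 + 2**14 - 1;
begin
  assert COLS = 512 report "unexpected image width" severity error;
  assert ROWS = 480 report "unexpected image height" severity error;
  assert COLS * ROWS = LAST_ADDR + 1
    report "wrap address does not match image size" severity error;
end check;
```

          Under these assumptions the wrap address 245759 is the last pixel of a
          245760-entry, 1-bit-deep image, which fits the 18-bit address bus.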


          12 FPGA ARCHITECTURE

          A field-programmable gate array (FPGA) is a semiconductor device that can
          be configured by the customer or designer after manufacturing, hence the
          name "field-programmable". FPGAs are programmed using a logic circuit
          diagram or source code in a hardware description language (HDL) to specify
          how the chip will work. They can be used to implement any logical function
          that an application-specific integrated circuit (ASIC) could perform, but the
          ability to update the functionality after shipping offers advantages for many
          applications.

          FPGAs contain programmable logic components called "logic blocks" and a
          hierarchy of reconfigurable interconnects that allow the blocks to be "wired
          together", somewhat like a one-chip programmable breadboard. Logic blocks
          can be configured to perform complex combinational functions, or merely
          simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
          include memory elements, which may be simple flip-flops or more complete
          blocks of memory.

          History

          The FPGA industry sprouted from programmable read-only memory (PROM) and
          programmable logic devices (PLDs). PROMs and PLDs both had the option of
          being programmed in batches in a factory or in the field (field-programmable);
          however, programmable logic was hard-wired between logic gates.

          Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
          first commercially viable field-programmable gate array in 1985: the
          XC2064. The XC2064 had programmable gates and programmable
          interconnects between gates, the beginnings of a new technology and market.
          The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
          3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
          into the National Inventors Hall of Fame for his invention.

          Some of the industry's foundational concepts and technologies for
          programmable logic arrays, gates, and logic blocks are founded in patents
          awarded to David W. Page and LuVerne R. Peterson in 1985.

          In the late 1980s, the Naval Surface Warfare Department funded an experiment
          proposed by Steve Casselman to develop a computer that would implement
          600,000 reprogrammable gates. Casselman was successful, and the system was
          awarded a patent in 1992.

          Xilinx continued unchallenged and quickly growing from 1985 to the
          mid-1990s, when competitors sprouted up, eroding significant market share.
          By 1993, Actel was serving about 18 percent of the market.

          The 1990s were an explosive period of time for FPGAs, both in sophistication
          and in the volume of production. In the early 1990s, FPGAs were primarily
          used in telecommunications and networking. By the end of the decade, FPGAs
          had found their way into consumer, automotive, and industrial applications.

          FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
          algorithm technology and FPGAs to create a sound recognition device.
          Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
          to decide the configuration needed to accomplish a sound recognition task.

          12.1 Modern developments

          A recent trend has been to take the coarse-grained architectural
          approach a step further by combining the logic blocks and interconnects of
          traditional FPGAs with embedded microprocessors and related peripherals to
          form a complete "system on a programmable chip". This work mirrors the
          architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
          Group, which combined a reconfigurable CPU architecture on a single chip
          called the SB24. That work was done in 1982. Examples of such hybrid
          technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
          which include one or more PowerPC processors embedded within the FPGA's
          logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
          processor in combination with Atmel's programmable logic architecture.

          An alternate approach to using hard-macro processors is to make use of "soft"
          processor cores that are implemented within the FPGA logic. (See "Soft
          processors" below.)

          As previously mentioned, many modern FPGAs have the ability to be
          reprogrammed at run time, and this is leading to the idea of reconfigurable
          computing or reconfigurable systems: CPUs that reconfigure themselves to
          suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
          example of a reconfigurable soft processor implemented on FPGAs. However,
          it does not support dynamic reconfiguration at run time, but instead adapts
          itself to a specific program.

          Additionally, new non-FPGA architectures are beginning to emerge.
          Software-configurable microprocessors such as the Stretch S5000 adopt a
          hybrid approach by providing an array of processor cores and FPGA-like
          programmable cores on the same chip.

          Gates

          1987: 9,000 gates, Xilinx

          1992: 600,000, Naval Surface Warfare Department

          Early 2000s: millions

          Market size

          1985: First commercial FPGA technology invented by Xilinx

          1987: $14 million

          ~1993: >$385 million

          2005: $1.9 billion

          2010 estimates: $2.75 billion

          12.2 FPGA Comparisons

          Historically, FPGAs have been slower, less energy efficient and generally
          achieved less functionality than their fixed ASIC counterparts. A combination
          of volume, fabrication improvements, research and development, and the I/O
          capabilities of new supercomputers has largely closed the performance gap
          between ASICs and FPGAs.

          Advantages include a shorter time to market, the ability to re-program in the
          field to fix bugs, and lower non-recurring engineering costs. Vendors can also
          take a middle road by developing their hardware on ordinary FPGAs but
          manufacturing their final version so that it can no longer be modified after
          the design has been committed.

          Xilinx claims that several market and technology dynamics are changing the
          ASIC/FPGA paradigm:

          – IC costs are rising aggressively

          – ASIC complexity has bolstered development time and costs

          – R&D resources and headcount are decreasing

          – Revenue losses for slow time-to-market are increasing

          – Financial constraints in a poor economy are driving low-cost technologies

          According to the company, these trends make FPGAs a better alternative than
          ASICs for a growing number of higher-volume applications than they have
          historically been used for, and they account for the growing number of FPGA
          design starts (see History).

          The primary differences between CPLDs and FPGAs are architectural. A
          CPLD has a somewhat restrictive structure consisting of one or more
          programmable sum-of-products logic arrays feeding a relatively small number
          of clocked registers. The result of this is less flexibility, with the
          advantage of more predictable timing delays and a higher logic-to-interconnect
          ratio. The FPGA architectures, on the other hand, are dominated by
          interconnect. This makes them far more flexible (in terms of the range of
          designs that are practical for implementation within them) but also far more
          complex to design for.
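          The CPLD structure described above can be sketched as a single
          sum-of-products term feeding one clocked register. The entity name
          sop_macrocell and the chosen function (a three-input majority vote) are
          assumptions for illustration, not a model of any particular device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sop_macrocell is
  port (clk, a, b, c : in  std_logic;
        q            : out std_logic);
end sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum_of_products : std_logic;
begin
  -- AND plane: three product terms. OR plane: their sum.
  sum_of_products <= (a and b) or (a and c) or (b and c);

  -- The sum-of-products output feeds a clocked register.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum_of_products;
    end if;
  end process;
end rtl;
```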

          Another notable difference between CPLDs and FPGAs is the presence in most
          FPGAs of higher-level embedded functions (such as adders and multipliers)
          and embedded memories, as well as logic blocks that implement decoders
          or mathematical functions.

          Some FPGAs have the capability of partial re-configuration, which lets one
          portion of the device be re-programmed while other portions continue running.

          12.3 Applications

          Applications of FPGAs include digital signal processing, software-defined
          radio, aerospace and defense systems, ASIC prototyping, medical imaging,
          computer vision, speech recognition, cryptography, bioinformatics, computer
          hardware emulation, radio astronomy and a growing range of other areas.

          FPGAs originally began as competitors to CPLDs and competed in a similar
          space, that of glue logic for PCBs. As their size, capabilities, and speed
          increased, they began to take over larger and larger functions, to the state
          where some are now marketed as full systems on chips (SoC). Particularly with
          the introduction of dedicated multipliers into FPGA architectures in the late
          1990s, applications which had traditionally been the sole reserve of DSPs
          began to incorporate FPGAs instead.

          FPGAs especially find applications in any area or algorithm that can make use
          of the massive parallelism offered by their architecture. One such area is
          code breaking, in particular brute-force attacks on cryptographic algorithms.

          FPGAs are increasingly used in conventional high performance computing
          applications, where computational kernels such as FFT or convolution are
          performed on the FPGA instead of a microprocessor.

          The inherent parallelism of the logic resources on an FPGA allows for
          considerable computational throughput even at low MHz clock rates. The
          flexibility of the FPGA allows for even higher performance by trading off
          precision and range in the number format for an increased number of parallel
          arithmetic units. This has driven a new type of processing called
          reconfigurable computing, where time-intensive tasks are offloaded from
          software to FPGAs.
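          The trade-off between number format and parallelism can be illustrated
          with a generate loop. In this sketch (entity, generic and port names
          are assumptions) halving WIDTH allows roughly twice as many LANES for
          the same input bus width, and every lane multiplies on every clock.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity parallel_mult is
  generic (
    LANES : positive := 4;   -- number of parallel arithmetic units
    WIDTH : positive := 8);  -- operand precision in bits
  port (
    clk  : in  std_logic;
    a, b : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    p    : out std_logic_vector(2*LANES*WIDTH - 1 downto 0));
end parallel_mult;

architecture rtl of parallel_mult is
begin
  -- One registered unsigned multiplier per lane, all working concurrently.
  lanes_gen : for k in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        p(2*WIDTH*(k + 1) - 1 downto 2*WIDTH*k) <= std_logic_vector(
          unsigned(a(WIDTH*(k + 1) - 1 downto WIDTH*k)) *
          unsigned(b(WIDTH*(k + 1) - 1 downto WIDTH*k)));
      end if;
    end process;
  end generate lanes_gen;
end rtl;
```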

          The adoption of FPGAs in high performance computing is currently limited by
          the complexity of FPGA design compared to conventional software and the
          extremely long turn-around times of current design tools, where a wait of
          4 to 8 hours is necessary after even minor changes to the source code.

          Traditionally, FPGAs have been reserved for specific vertical applications
          where the volume of production is small. For these low-volume applications,
          the premium that companies pay in hardware costs per unit for a
          programmable chip is more affordable than the development resources spent
          on creating an ASIC. Today, new cost and performance dynamics have
          broadened the range of viable applications.

          Architecture

          The most common FPGA architecture consists of an array of configurable
          logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
          routing channels have the same width (number of wires). Multiple I/O pads
          may fit into the height of one row or the width of one column in the array.

          An application circuit must be mapped into an FPGA with adequate resources.
          While the number of CLBs and I/Os required is easily determined from the
          design, the number of routing tracks needed may vary considerably even
          among designs with the same amount of logic. (For example, a crossbar switch
          requires much more routing than a systolic array with the same gate count.)
          Since unused routing tracks increase the cost (and decrease the performance)
          of the part without providing any benefit, FPGA manufacturers try to provide
          just enough tracks so that most designs that will fit in terms of LUTs and
          I/Os can be routed. This is determined by estimates such as those derived
          from Rent's rule or by experiments with existing designs.

          The FPGA is an array or island-style FPGA. It consists of an array of logic
          blocks and routing channels. Two I/O pads fit into the height of one row or
          the width of one column, as shown below. All the routing channels have the
          same width (number of wires).

          [Figure 12.1: FPGA structure]

          A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
          flip-flop, as shown below. In recent years, manufacturers have started moving
          to 6-input LUTs in their high performance parts, claiming increased
          performance.

          [Figure: Typical logic block]

          There is only one output, which can be either the registered or the
          unregistered LUT output. The logic block has four inputs for the LUT and a
          clock input. Since clock signals (and often other high-fanout signals) are
          normally routed via special-purpose dedicated routing networks in commercial
          FPGAs, they and other signals are separately managed.
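          A behavioral sketch of such a logic block is given below. The entity
          name, the INIT generic holding the 16-entry truth table and the
          REGISTERED generic selecting the output are assumptions. In a real
          device these choices are made by configuration bits.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    -- Truth table of the LUT: bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- true: output comes from the flip-flop; false: directly from the LUT.
    REGISTERED : boolean := false);
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic);
end logic_block;

architecture rtl of logic_block is
  signal lut_out, ff_out : std_logic;
begin
  -- 4-input LUT: the inputs address one bit of the truth table.
  lut_out <= INIT(to_integer(unsigned(i)));

  -- Flip-flop on the LUT output.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT value.
  o <= ff_out when REGISTERED else lut_out;
end rtl;
```

          Changing INIT alone turns the same block into any other function of
          four inputs, which is what makes the LUT a general-purpose logic element.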

          For this example architecture, the locations of the FPGA logic block pins are
          shown below.

          [Figure: Logic block pin locations]

          Each input is accessible from one side of the logic block, while the output
          pin can connect to routing wires in both the channel to the right and the
          channel below the logic block.

          Each logic block output pin can connect to any of the wiring segments in the
          channels adjacent to it.

          Similarly, an I/O pad can connect to any one of the wiring segments in the
          channel adjacent to it. For example, an I/O pad at the top of the chip can
          connect to any of the W wires (where W is the channel width) in the horizontal
          channel immediately below it.

          Generally, the FPGA routing is unsegmented. That is, each wiring segment
          spans only one logic block before it terminates in a switch box. By turning on
          some of the programmable switches within a switch box, longer paths can be
          constructed. For higher speed interconnect, some FPGA architectures use
          longer routing lines that span multiple logic blocks.

          Whenever a vertical and a horizontal channel intersect, there is a switch box.
          In this architecture, when a wire enters a switch box, there are three
          programmable switches that allow it to connect to three other wires in adjacent
          channel segments. The pattern, or topology, of switches used in this
          architecture is the planar or domain-based switch box topology. In this switch
          box topology, a wire in track number one connects only to wires in track
          number one in adjacent channel segments, wires in track number 2 connect
          only to other wires in track number 2, and so on. The figure below illustrates
          the connections in a switch box.

          [Figure: Switch box topology]

          Modern FPGA families expand upon the above capabilities to include
          higher-level functionality fixed into the silicon. Having these common
          functions embedded into the silicon reduces the area required and gives those
          functions increased speed compared to building them from primitives. Examples
          of these include multipliers, generic DSP blocks, embedded processors, high
          speed I/O logic and embedded memories.

          FPGAs are also widely used for systems validation, including pre-silicon
          validation, post-silicon validation, and firmware development. This allows
          chip companies to validate their design before the chip is produced in the
          factory, reducing the time to market.

          12.4 FPGA Design and Programming

          To define the behavior of the FPGA, the user provides a hardware description
          language (HDL) or a schematic design. The HDL form might be easier to work
          with when handling large structures, because it is possible to just specify
          them numerically rather than having to draw every piece by hand. On the other
          hand, schematic entry can allow for easier visualisation of a design.

          Then, using an electronic design automation tool, a technology-mapped netlist
          is generated. The netlist can then be fitted to the actual FPGA architecture
          using a process called place-and-route, usually performed by the FPGA
          company's proprietary place-and-route software. The user will validate the
          map, place and route results via timing analysis, simulation, and other
          verification methodologies. Once the design and validation process is
          complete, the binary file generated (also using the FPGA company's
          proprietary software) is used to (re)configure the FPGA.

          Going from schematic/HDL source files to actual configuration: the source
          files are fed to a software suite from the FPGA/CPLD vendor that, through
          different steps, will produce a file. This file is then transferred to the
          FPGA/CPLD via a serial interface (JTAG) or to an external memory device
          like an EEPROM.

          The most common HDLs are VHDL and Verilog, although in an attempt to
          reduce the complexity of designing in HDLs, which have been compared to the
          equivalent of assembly languages, there are moves to raise the abstraction
          level through the introduction of alternative languages.

          To simplify the design of complex systems in FPGAs, there exist libraries of
          predefined complex functions and circuits that have been tested and optimized
          to speed up the design process. These predefined circuits are commonly called
          IP cores, and are available from FPGA vendors and third-party IP suppliers
          (rarely free, and typically released under proprietary licenses). Other
          predefined circuits are available from developer communities such as
          OpenCores (typically free, and released under the GPL, BSD or similar
          license), and other sources.

          In a typical design flow, an FPGA application developer will simulate the
          design at multiple stages throughout the design process. Initially, the RTL
          description in VHDL or Verilog is simulated by creating test benches to
          simulate the system and observe results. Then, after the synthesis engine has
          mapped the design to a netlist, the netlist is translated to a gate-level
          description, where simulation is repeated to confirm the synthesis proceeded
          without errors. Finally, the design is laid out in the FPGA, at which point
          propagation delays can be added and the simulation run again with these
          values back-annotated onto the netlist.

          Basic Process Technology Types

          SRAM - Based on static memory technology. In-system programmable and
          re-programmable. Requires external boot devices. CMOS.

          Antifuse - One-time programmable. CMOS.

          EPROM - Erasable Programmable Read-Only Memory technology. Usually
          one-time programmable in production because of plastic packaging.
          Windowed devices can be erased with ultraviolet (UV) light. CMOS.

          EEPROM - Electrically Erasable Programmable Read-Only Memory
          technology. Can be erased, even in plastic packages. Some, but not all,
          EEPROM devices can be in-system programmed. CMOS.

          Flash - Flash-erase EPROM technology. Can be erased, even in plastic
          packages. Some, but not all, flash devices can be in-system programmed.
          Usually, a flash cell is smaller than an equivalent EEPROM cell and is
          therefore less expensive to manufacture. CMOS.

          Fuse - One-time programmable. Bipolar.

          Major Manufacturers

          Xilinx and Altera are the current FPGA market leaders and long-time industry
          rivals. Together, they control over 80 percent of the market, with Xilinx alone
          representing over 50 percent.

          Xilinx also provides free Windows and Linux design software, while Altera
          provides free Windows tools; the Solaris and Linux tools are only available via
          a rental scheme.

          Other competitors include Lattice Semiconductor (flash, SRAM), Actel
          (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
          Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld
          focused CSSP, no general purpose FPGAs).

          12.5 FPGA prototype

          FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
          prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
          hardware verification and early software development.

          Verifying hardware designs alongside early software and firmware co-design
          has become mainstream, and prototyping SoC and ASIC designs on an FPGA
          has become a good method of doing this.

          Reasons why prototyping is important

          1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is
          functionally correct. The alternative is for designers to rely only on software
          simulations to verify that their hardware design is sound, where simulation speed and
          modeling accuracy limitations hinder development.

          2. Due to time constraints, many projects cannot wait until the silicon is back from
          the foundry to start on software tests. FPGA prototyping allows much more
          time for software development and for testing at the software-hardware
          integration stage. This allows many unforeseen software bugs, which appear because
          of today's array of operating systems, applications and hardware, to be found.

          3. Prototyping also allows the developer to ensure that all IP technologies in the
          system work well together, off the simulation stage and in actual form.

          4. Prototypes have the added advantage of serving as demo platforms for SoC clients,
          bringing in interest early. This speeds up the overall development cycle and
          allows more enhancement or improvement of the chip features than would
          otherwise have been possible.

          13 SIMULATION RESULTS

          GATE LEVEL

          Fig: VGA CONTROLLER

          GATE LEVEL

          Fig: SYNC UNIT

          TECHNOLOGY SCHEMATIC

          14 APPLICATIONS

          Motion Tracking Systems

          Image processing systems

          Displaying Systems

          14.1 CONCLUSION

          Hence, implemented on an FPGA, the Controller takes a processor-generated
          (pixelated) picture from memory space and provides digital RGB values for
          each pixel, as well as horizontal and vertical synchronization signals, in order
          to correctly display the picture on a connected monitor.

          REFERENCES

          1. http://en.wikipedia.org/wiki/Video_Graphics_Array

          2. "A VGA display controller" by Eduardo Sanchez

          3. Hinner, "VGA Interface and video signal documents"


            Pixels of image

            Display devices also have pixels. A display resolution of 640x480 is a
            standard display resolution.

            Pixel


            Fig: An image with a portion greatly enlarged, in which the
            individual pixels are rendered as little squares and can easily be seen

            4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

            In digital imaging, a pixel (or picture element [1]) is a single point in a raster
            image. The pixel is the smallest addressable screen element; it is the smallest
            unit of picture which can be controlled. Each pixel has its own address. The address
            of a pixel corresponds to its coordinates. Pixels are normally arranged in a
            2-dimensional grid, and are often represented using dots or squares. Each pixel is
            a sample of an original image, where more samples typically provide more
            accurate representations of the original. The intensity of each pixel is variable.
            In color image systems, a color is typically represented by three or four
            component intensities such as red, green and blue, or cyan, magenta, yellow
            and black.

            Color refreshing rate

            Color refreshing rate = 60 Hz

            The color refreshing rate represents the number of frames that are transmitted per
            second. The VGA controller must transmit at least 60 frames per second.

            5 Hardware Description

            5.1 Block Diagram

            5.1 VGA Synchronization Unit

            The Synchronization Unit provides the horizontal and vertical
            synchronization signals, HSYNC and VSYNC, that are required to correctly
            display a picture frame within the confines of a monitor's display area.

            These synchronization signals are used as control inputs by the
            monitor's horizontal and vertical deflection circuits. These circuits deflect the
            electrons emitted by the three primary color electron guns (Red, Green, Blue)
            left to right and from top to bottom, respectively. HSYNC provides the start
            and stop times for the horizontal deflection circuit, so that a line of pixels is
            correctly drawn across the screen display. VSYNC provides the start and stop
            times for the vertical deflection circuit, so that the lines of a frame are correctly
            drawn from the top to the bottom of the screen display.

            The resolution for the display is defined by the level on the
            RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low,
            the 800x600 resolution (SVGA) is used.

            Although the resolution determines the area of a monitor's screen
            within which an image can be displayed, the full extents of the chosen
            resolution do not have to be used. The actual extents of the image display area
            on the screen can be controlled by the use of the DISPSIZE_H and
            DISPSIZE_V inputs. These inputs determine the total number of pixels to be
            used in a line and the total number of lines in a frame, respectively.

            5.2 Blank pixel generation

            The total number of viewable pixels in a line and viewable lines in a
            frame is determined by the display resolution chosen through the
            RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
            the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
            chosen display resolution is used or not, the areas of the monitor screen to the
            top, bottom, left and right of the viewable frame area are blanked by putting
            black pixels at the required line-pixel positions. This has the effect of centering
            the image on the screen.

            The color generated for a pixel in the Pixel Unit depends on whether
            the particular pixel requires to be blanked or not. The Synchronization Unit
            provides a signal to the Pixel Unit for this very reason. This is the line display
            enable signal, en. The signal is checked on each rising edge of the external
            clock signal (CLK) and is set as follows:

            If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
                en = 0 (pixel requires to be blanked - set color to be black)
            Else
                en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
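The blanking rule above can be sketched as a small behavioural model. This is only an illustration in Python, not the Controller's HDL; the function name `line_enable` and the sample counter values are assumptions made for the example.

```python
def line_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return the line display enable level: 1 = viewable pixel, 0 = blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: the pixel is forced to black
    return 1      # inside the viewable area: generate the RGB color


# Hypothetical full-size VGA frame: 640 viewable pixels, 480 viewable lines.
for h, v in [(0, 0), (639, 479), (640, 0), (10, 480)]:
    print(h, v, line_enable(h, v, 640, 480))
```

In hardware the same comparison would be evaluated on each rising edge of CLK; here it is simply a pure function of the two counters.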

            6 SIGNALS IN VGA CONTROLLER

            For the most common VGA mode (640x480, 60 Hz, non-interlaced) the horizontal
            timings are:

            Parameter                  Value      Unit
            Clock frequency            25.175     MHz
            Horizontal pixels          640
            Horizontal sync polarity   Negative
            Total time for each line   31.77      µs
            Front porch (A)            0.94       µs
            Sync pulse length (B)      3.77       µs
            Back porch (C)             1.89       µs
            Active video (D)           25.17      µs

            (Total horizontal sync time 6.60 µs)

            6.1 The vertical timings are:

            Parameter                  Value      Unit
            Vertical lines             480
            Vertical sync polarity     Negative
            Vertical frequency         59.94      Hz
            Front porch (E)            0.35       ms
            Sync pulse length (F)      0.06       ms
            Back porch (G)             1.02       ms
            Active video (H)           15.25      ms

            (Total vertical sync time 1.43 ms)

            640 x 400 at 70 Hz is the video mode used for booting most x86 personal
            computers.

            640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

            It should be noted that the actual timings vary slightly. For example, for
            640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
            25.174 MHz gives 633 pixels rather than the expected 640 pixels.
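The arithmetic behind this remark is short enough to check directly. The sketch below multiplies an interval in microseconds by a clock in MHz to get whole pixel clocks; the helper names are hypothetical and only the two figures quoted in the text (25.17 µs and 25.174 MHz) come from the document.

```python
import math


def clocks_in_interval(duration_us: float, pixel_clock_mhz: float) -> int:
    """Whole pixel clocks that fit in an interval (microseconds x MHz = cycles)."""
    return math.floor(duration_us * pixel_clock_mhz)


def interval_for_pixels(pixels: int, pixel_clock_mhz: float) -> float:
    """Time in microseconds needed to clock out the given number of pixels."""
    return pixels / pixel_clock_mhz


print(clocks_in_interval(25.17, 25.174))           # whole pixels in the quoted active time
print(round(interval_for_pixels(640, 25.174), 2))  # active time needed for 640 pixels
```

The first call evaluates to 633 whole pixels, and the second shows that 640 pixels would need roughly 25.42 µs of active video at that clock, which is why the rounded table values do not multiply out exactly.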

            6.2 Generation of the horizontal synchronization signal - HSYNC

            The HSYNC signal is High (inactive) after an external reset signal
            (RST) is received by the VGA Controller. The signal is updated on each rising
            edge of the external clock signal (CLK).

            The state of the HSYNC signal depends on the value stored in the horizontal
            counter and is driven low when:

            HCOUNT >= (PixTotal + BlankingLeft)

            and remains low while:

            HCOUNT < (PixTotal + BlankingLeft + TLSync)

            Vertical (Frame) Period

            6.3 Generation of the vertical synchronization signal - VSYNC

            The VSYNC signal is High (inactive) after an external reset signal
            (RST) is received by the VGA Controller. The signal is updated after every
            line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

            The state of the VSYNC signal depends on the value stored in the vertical
            counter and is driven low when:

            VCOUNT >= (LinTotal + BlankingLeft)

            and remains low while:

            VCOUNT < (LinTotal + BlankingLeft + TFSync)
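Both sync conditions have the same shape: the output goes low for a window that starts after the viewable count plus a blanking offset, and lasts for the sync length. A minimal Python sketch of that comparison follows; the function name and the example numbers (640 viewable pixels, a 16-clock offset and a 96-clock pulse) are illustrative assumptions, not values taken from the Controller.

```python
def sync_level(count: int, visible_total: int, blanking: int, sync_len: int) -> int:
    """Return 0 (active low) while the counter is inside the sync window, else 1."""
    start = visible_total + blanking   # first count at which the pulse is driven low
    end = start + sync_len             # first count at which it returns high
    return 0 if start <= count < end else 1


# Hypothetical horizontal example: PixTotal=640, BlankingLeft=16, TLSync=96.
for hcount in (0, 655, 656, 751, 752):
    print(hcount, sync_level(hcount, 640, 16, 96))
```

The same function models VSYNC when it is fed VCOUNT, LinTotal, the vertical blanking offset and TFSync.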

            Address Counter (MEMADDR)

            The address counter is used to store the position of the next consecutive
            pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
            signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
            the next pixel from picture memory.

            The counter is reset to zero when the VGA Controller receives an external
            reset signal (RST). The size of the counter depends on the values chosen for
            DISPSIZE_H and DISPSIZE_V, as the range is simply:

            0 to (PixTotal x LinTotal) - 1

            Taking the maximum number of pixels in a line and lines in a frame for each
            of the supported display resolutions, the maximum ranges for the counter are:

            • 640x480: 0 to 307199
            • 800x600: 0 to 479999

            The counter has 19-bit resolution.
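The 19-bit figure follows from the largest address in each range. As a quick check, assuming nothing beyond the two resolutions listed above, the number of bits needed to hold the last address can be computed like this (the function name is hypothetical):

```python
def counter_bits(pix_total: int, lin_total: int) -> int:
    """Bits needed to hold the largest address, (PixTotal x LinTotal) - 1."""
    last_address = pix_total * lin_total - 1
    return last_address.bit_length()


for width, height in [(640, 480), (800, 600)]:
    print(f"{width}x{height}: last address {width * height - 1}, "
          f"{counter_bits(width, height)} bits")
```

Both 307199 and 479999 lie between 2^18 and 2^19, so a single 19-bit counter covers either resolution.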

            While the value in the horizontal counter (HCOUNT) is less than the
            total number of viewable pixels in a line (PixTotal, the integer value of
            DISPSIZE_H), the counter is incremented on the rising edge of the external
            clock signal (CLK). Pixel addressing within the frame is consecutive. When
            the counter reaches the last pixel in a line, its incremented value is the first
            pixel in the next line down.

            The address counter will continue to be incremented until the value in the
            vertical counter (VCOUNT) is greater than or equal to the total number of
            viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
            point, it will be rolled over to zero.
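A small simulation makes the counting rule concrete. The sketch below is a behavioural approximation in Python under stated assumptions: the line and frame periods are passed in as plain clock and line counts, the function name is invented for the example, and the tiny 4 x 3 frame is chosen only to keep the output readable.

```python
def visible_addresses(pix_total: int, lin_total: int,
                      line_period: int, frame_period: int) -> list:
    """Return the address presented for every viewable pixel in one frame."""
    memaddr = 0
    addresses = []
    for vcount in range(frame_period):        # one iteration per line
        for hcount in range(line_period):     # one iteration per CLK edge
            if vcount >= lin_total:
                memaddr = 0                   # past the last viewable line: roll over
            elif hcount < pix_total:
                addresses.append(memaddr)     # address used for this viewable pixel
                memaddr += 1                  # held unchanged during line blanking
    return addresses


# Hypothetical frame: 4 viewable pixels x 3 viewable lines, 6 clocks per line, 5 lines.
print(visible_addresses(4, 3, 6, 5))
```

The printed addresses run consecutively from 0 to 11, so the first pixel of each new line follows directly after the last pixel of the line above it.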

            7 VGA Controller

            The VGA Controller provides a simple 8-bit interface between a host
            microcontroller and any VGA-compatible monitor. This variant of the
            Controller provides six modes of display, depending on the resolution chosen
            (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
            White, 16 Colors or 64 Colors).

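            7.1 VGA Controller Tasks

            1) To display the frames at a minimum 60 Hz refreshing rate:

            1 sec = 60 frames
            1 frame = max 1/60 sec
            1 pixel = (1/60 sec) / 307200 = approximately 54 ns

            Because the blanking intervals also take up clock cycles in every frame,
            the maximum time for 1 pixel is taken as 40 ns.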

            2) To send information for each and every pixel of a frame by means of
            RGB signals.

            3) After completely displaying 1 row of information, the column must be
            reset and the row has to be incremented. This is done by the HSYNC signal.

            4) After completing 1 frame of information, a new frame has to be started;
            for this, the row and column are reset. This is done by the VSYNC signal.

            => To give the information for 1 pixel we use 1 clock (for controlling all
            operations):

            1 clock = 1 pixel
            1 clock cycle period = 40 ns
            Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

            So if we use a 25 MHz clock and display 1 pixel of information on each and every
            clock, we can attain a minimum 60 Hz refreshing rate.

            Do not use a clock frequency of less than 25 MHz.
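The pixel-time budget can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and the figure of 800 x 525 clocks per frame (viewable area plus blanking) is an assumption based on the commonly quoted totals for 640x480 at 60 Hz rather than a value stated in this section.

```python
def pixel_time_ns(refresh_hz: float, clocks_per_frame: int) -> float:
    """Time available per pixel clock, in nanoseconds."""
    return 1e9 / (refresh_hz * clocks_per_frame)


def refresh_rate_hz(clock_hz: float, clocks_per_frame: int) -> float:
    """Frames per second achieved when one pixel is sent per clock."""
    return clock_hz / clocks_per_frame


VISIBLE = 640 * 480        # 307200 viewable pixels
WITH_BLANKING = 800 * 525  # assumed total clocks per frame, including blanking

print(round(pixel_time_ns(60, VISIBLE), 2))        # budget if only viewable pixels counted
print(round(pixel_time_ns(60, WITH_BLANKING), 2))  # budget once blanking is included
for clock_mhz in (15, 25, 30):
    print(clock_mhz, "MHz ->", round(refresh_rate_hz(clock_mhz * 1e6, WITH_BLANKING), 1), "Hz")
```

Under these assumptions the per-pixel budget drops from about 54 ns to just under 40 ns once blanking is counted, which is consistent with a clock of about 25 MHz, and a 15 MHz clock falls well short of 60 frames per second.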

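            Ex:

            Clock frequency      15 MHz    25 MHz                            30 MHz
            Time period          More      Effectively utilized frequency    Less
            No. of frames/sec    Less                                        More

            As the clock frequency decreases, the number of frames per second decreases;
            as it increases, the number of frames per second increases.
            (A 15 MHz clock is unable to display 60 frames/sec.)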

            Signals used for image display:

            R (Red)
            G (Green)
            B (Blue)
            HSYNC (Horizontal Synchronous signal)
            VSYNC (Vertical Synchronous signal)

            All the above signals are used to control the image data so that it is correctly
            displayed on the VGA display monitor.

            7.2 Timings for front porch, sync pulse and back porch

                           Horizontal sync    Vertical sync
            Front porch    0.94 µs            0.35 ms
            Sync pulse     3.77 µs            0.063 ms
            Back porch     1.89 µs            1.02 ms

            8 FUNCTIONAL DESCRIPTION

            8.1 Symbol

            VGA Controller Symbol

            Pin description

            The pin description is shown in the table below.

            Table: VGA pin description

            Name | Type | Polarity | Bus size | Description

            Global Control Signals

            CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
                640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
                800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

            RST | I | High | - | Global system reset.

            VGA Input Settings Signals

            RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

            CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

            DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

            DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

            Data Memory Control Signals

            RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en generated by the Synchronization Unit of the Controller.

            DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows: Black & White mode, 8 1-bit pixels; 16 Colors mode, 2 4-bit pixels; 64 Colors mode, 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

            ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

            VGA Monitor Control Signals

            HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

            VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

            R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

            G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

            B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

            9 VGA SIGNAL TIMING

            Figure 3 summarizes the signal timing involved in sending a line of
            pixels and a frame of lines. The actual time values differ according to the
            resolution selected (640x480 or 800x600), the processor-defined values for
            DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
            (CLK).

            9.1 VGA PIXEL UNIT

            The Pixel Unit provides access to the pixelated image stored in external
            picture memory, reading in data a byte at a time and formatting each pixel to
            be displayed. For each pixel, the 6-bit RGB value required for the control of
            the monitor's three primary color electron guns is generated, so that the pixel
            is displayed on the screen with the correct color.

            Accessing the stored image

            The image to be displayed on the monitor screen is written by the host
            microcontroller into external memory space (RAM). This memory space can
            be located anywhere (e.g. using a block of RAM within the FPGA design, or
            using the SRAM on the NanoBoard itself).

            Pixel data is stored in the memory space in bytes. The number of pixels in a
            byte depends on which color palette is being used for the image (selected by
            the CMOD input):

            • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels
            • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels
            • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
              (DATA[7..6] are not used)

            The size of memory required to store a picture is determined by the total
            number of viewable pixels in a line (determined by DISPSIZE_H), the total
            number of viewable lines in a frame (determined by DISPSIZE_V) and the
            number of pixels stored in each byte in memory space:

            Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
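As a worked example of this formula, the sketch below evaluates it for the three palettes at both full resolutions. The dictionary keys and function name are hypothetical labels chosen for the example; the pixels-per-byte figures are the ones listed for each CMOD palette.

```python
# Pixels packed into each byte for the three palettes selected by CMOD.
PIXELS_PER_BYTE = {"black_and_white": 8, "16_colors": 2, "64_colors": 1}


def picture_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes of picture memory needed, rounded up to a whole byte."""
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-(pix_total * lin_total) // per_byte)  # ceiling division


for width, height in [(640, 480), (800, 600)]:
    for mode in PIXELS_PER_BYTE:
        print(f"{width}x{height} {mode}: {picture_bytes(width, height, mode)} bytes")
```

At 640x480 this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.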

            The address in RAM where the next pixel is stored is determined using
            an internal signal provided by the Synchronization Unit, hvcnt, which
            reflects the current contents of the MEMADDR register. The exact addressing
            is described below.

            9.2 Black & White mode

            The picture memory address (the byte of data containing the next 8 pixels)
            is determined by using bits 18..3 of hvcnt and right shifting the contents by
            three:

            ADDR_PIXEL = "00" & hvcnt[18..3]

            16 Colors mode

            The picture memory address (the byte of data containing the next 2 pixels)
            is determined by using bits 18..1 of hvcnt and right shifting the contents by
            one:

            ADDR_PIXEL = "0" & hvcnt[18..1]

            64 Colors mode

            The picture memory address (the byte of data containing the next pixel) is
            determined by using the full value of hvcnt:

            ADDR_PIXEL = hvcnt[18..0]
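The three addressing rules amount to dividing the pixel index by the number of pixels per byte, which is a right shift by 3, 1 or 0 bits. A minimal Python sketch follows, with hypothetical mode labels and function name; it models the arithmetic only, not the bus concatenation.

```python
# Right shift applied to hvcnt in each palette mode: log2(pixels per byte).
ADDRESS_SHIFT = {"black_and_white": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds the pixel numbered hvcnt."""
    hvcnt &= 0x7FFFF                      # hvcnt is a 19-bit value
    return hvcnt >> ADDRESS_SHIFT[mode]   # upper bits are zero-filled, as with "00" & ...


# Pixel number 17 lives in byte 2, byte 8 or byte 17 depending on the palette.
for mode in ADDRESS_SHIFT:
    print(mode, addr_pixel(17, mode))
```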

            The Pixel register (PIXREG)

            The Pixel register is used to receive the byte of pixel data read from the
            current address in memory space. The register, PIXREG, is reset to zero when
            the VGA Controller receives an external reset signal (RST).

            The register is updated on each rising edge of the CLK signal. Data can be
            read from the memory space as long as the RD signal is active (High). The RD
            signal is itself controlled by the external line display enable signal, enex. This
            internally generated signal is defined as follows:

            If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
               ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
                enex = 0
            Else
                enex = 1

            When enex is Low, read access from memory is disabled (RD = 0).

            The point at which data is loaded from memory into PIXREG depends on the
            particular color palette that is chosen: Black & White, 16 Colors or 64
            Colors.

            Black & White mode

            The next byte of data will be loaded into the register whenever the
            lowest three bits of the hvcnt signal, received from the Synchronization Unit,
            are "000".

            For the currently loaded byte, the active pixel is always in the lowest bit
            position of the Pixel register. Each pixel in the data byte is moved into this
            active pixel position by shifting the contents of the register right by one bit on
            each rising edge of CLK.

            16 Colors mode

            The next byte of data will be loaded into the register whenever the
            lowest bit of the hvcnt signal, received from the Synchronization Unit, is a
            "0".

            For the currently loaded byte, the active pixel is always in the low order
            nibble of the Pixel register. Remember that in this mode, each byte of data
            contains two pixels. The second pixel is moved into this active pixel position
            by shifting the contents of the register right by four bits on the rising edge of
            CLK.

            64 Colors mode

            The next byte of data will be loaded into the register on the rising edge
            of the external system clock signal (CLK). In this mode, the read of pixel data
            does not depend on the status of the hvcnt signal received from the
            Synchronization Unit.
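The shift-register behaviour described for the three palettes can be sketched in Python as follows. This is a behavioural illustration, not the Controller's HDL: the function name and mode labels are hypothetical, and the whole byte is unpacked in one call rather than one pixel per CLK edge.

```python
# (bits per pixel, pixels per byte) for each palette mode.
PIXEL_FORMAT = {"black_and_white": (1, 8), "16_colors": (4, 2), "64_colors": (6, 1)}


def unpack_pixels(data_byte: int, mode: str) -> list:
    """Return the pixels of one byte in display order (active pixel first)."""
    bits, count = PIXEL_FORMAT[mode]
    mask = (1 << bits) - 1
    pixreg = data_byte & 0xFF          # byte loaded into the Pixel register
    pixels = []
    for _ in range(count):
        pixels.append(pixreg & mask)   # active pixel sits in the lowest bits
        pixreg >>= bits                # shift the next pixel into the active position
    return pixels


print(unpack_pixels(0b10110001, "black_and_white"))  # eight 1-bit pixels, lowest bit first
print(unpack_pixels(0xA5, "16_colors"))              # low nibble first, then high nibble
print(unpack_pixels(0xFF, "64_colors"))              # bits 7 and 6 are ignored
```

Because the active pixel is always taken from the low end of the register, the lowest-order pixel in each byte is the first one displayed.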

            9.3 The RGB register (RGBREG)

            The RGB register is used to store the six bits that are required for
            driving the red, green and blue color guns of the target monitor. When the
            chosen color palette is either Black & White or 16 Colors, these six bits are
            obtained by mapping the value of the active pixel to a predefined RGB code.
            When the chosen color palette is 64 Colors, the actual pixel value is used
            directly.

            The register, RGBREG, is reset to zero (000000) when the VGA
            Controller receives an external reset signal (RST). This RGB code represents
            black.

            The register is updated on each rising edge of the CLK signal, and the
            value loaded is dependent on the state of the line display enable signal, en.
            When en is Low, blanking is required and RGBREG is loaded with the code
            for black (000000).

            Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
            and a 4-bit pixel (16 Colors mode) into the required RGB color code.

            The RGB color code stored in the RGB register is output from the VGA
            Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
            and B1).

            The monitor itself expects analog signals as inputs to its electron gun control
            circuits. This is achieved by using 2-bit digital to analog converter circuitry,
            located on the NanoBoard itself, as shown in Figure 4.

            Figure 9.1 Digital to Analog RGB conversion

            For each color, the 2-bit digital signal from the VGA Controller can be
            converted into 4 distinct analog levels. These levels specify the intensity of
            each of the three primary colors to use when displaying the pixel on the
            monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
            brightness).

            With each analog input being one of four possible levels, the monitor
            can display each pixel on the screen with one of 64 different color
            permutations.
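The relationship between the 2-bit codes, the four analog levels and the 64 colors can be illustrated numerically. The sketch below assumes the four levels are evenly spaced between 0 V and 0.7 V; the text only gives the end points, so the intermediate voltages are an assumption, as are the function and constant names.

```python
V_MAX = 0.7  # volts at maximum brightness, from the text


def dac_level(code: int) -> float:
    """Analog level for a 2-bit intensity code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("a 2-bit code must be in the range 0..3")
    return V_MAX * code / 3


levels = [round(dac_level(code), 3) for code in range(4)]
print(levels)                  # four levels per color channel
print(len(levels) ** 3)        # combinations across the R, G and B channels
```

Four levels on each of three channels give 4 x 4 x 4 = 64 combinations, which matches the 64 Colors palette.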

            Using Block RAM in Spartan-3E Generation FPGAs

            For applications requiring large on-chip memories, Spartan™-3
            Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
            Using various configuration options, SelectRAM blocks create RAM, ROM,
            FIFOs, large look-up tables, data width converters, circular buffers and shift
            registers, each supporting various data widths and depths. This application note
            describes the features and capabilities of block SelectRAM and illustrates how
            to specify the various options using the Xilinx CORE Generator™ system or
            via VHDL or Verilog instantiation.

            Table 2 Block RAM Available in Spartan-3E Devices

            Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
            XC3S100E     1             4                       4                  73,728           72
            XC3S250E     2             6                       12                 221,184          216
            XC3S500E     2             10                      20                 368,640          360
            XC3S1200E    2             14                      28                 516,096          504
            XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are
allocated to data storage and, in some memory configurations, an additional
2K bits are allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical: both ports are
interchangeable and both support data read and write operations. Each memory
port is synchronous, with its own clock, clock enable, and write enable. Read
operations are also synchronous and require a clock edge and clock enable.
Though physically a dual-port memory, block RAM simulates single-port memory
in an application, as shown in Figure 1. Furthermore, each block memory
supports multiple configurations or aspect ratios. Table 3 summarizes the
essential SelectRAM features. Multiple block RAMs can be cascaded to create
deeper and wider memory organizations, with a minimal timing penalty incurred
through specialized routing resources.
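The single-port, synchronous behaviour described above can be sketched in VHDL. This is a minimal illustration rather than the memory core used in this project: the entity name bram_16kx1, its port names, and the 16K x 1 organization are assumptions chosen to match the 16K data bits of one block RAM. Whether a synthesis tool maps this description onto a block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM, 16K x 1 bit (names are illustrative only).
entity bram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity bram_16kx1;

architecture rtl of bram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the output changes only on a clock edge
        -- with the enable asserted. On a write, the old contents of
        -- the addressed location are read (read-first behaviour).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write happen inside the clocked process, which mirrors the statement that read operations need a clock edge and a clock enable.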


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s to manage design problems that
involved large circuits and multiple teams of engineers.
It was funded by the US Department of Defense.
The first publicly available version was released in 1985.
In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL.
In 1987 the language was standardized as IEEE 1076-1987.
An improved version of the language was released in 1994 as IEEE
Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.
Soon after IEEE 1076-1987 was released, simulator companies began using
their own non-standard types, so VHDL was becoming nonstandard.
The IEEE 1164 standard was developed by the IEEE in response.
IEEE 1164 contains definitions for a nine-valued data type, std_logic.
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware.
It defines, e.g., two numeric types: signed and unsigned.
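The types named above can be tried out in a small simulation-only sketch. The entity name std_types_demo and all signal names are illustrative assumptions. It shows a high-impedance driver being overridden on a resolved std_logic signal, and the same 8-bit pattern read as unsigned and as signed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic ('U','X','0','1','Z','W','L','H','-')
use ieee.numeric_std.all;     -- IEEE 1076.3: signed and unsigned

entity std_types_demo is
end entity std_types_demo;

architecture sim of std_types_demo is
  signal bus_line : std_logic;
  signal count_u  : unsigned(7 downto 0) := to_unsigned(200, 8);
  signal count_s  : signed(7 downto 0)   := to_signed(-56, 8);
begin
  -- Two drivers on one resolved signal: 'Z' (high impedance) yields to '1'.
  bus_line <= 'Z';
  bus_line <= '1';

  check : process
  begin
    wait for 1 ns;
    assert bus_line = '1'
      report "resolution of 'Z' and '1' should be '1'" severity failure;
    -- 200 unsigned and -56 signed share the bit pattern "11001000".
    assert std_logic_vector(count_u) = std_logic_vector(count_s)
      report "bit patterns should match" severity failure;
    assert to_integer(count_u) = 200 and to_integer(count_s) = -56
      report "numeric interpretation mismatch" severity failure;
    wait;  -- stop the process
  end process check;
end architecture sim;
```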

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports provide channels of communication between the component and its
environment.
Each port must have a name, a direction, and a type.
An entity may have NO port declaration.

Port directions

In: The value of the port can be read inside the component but cannot be
assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to the port, but data from the port cannot be
read. Multiple assignments are allowed.
Inout: Bidirectional. Assignments can be made and data can be read. Multiple
assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment
(buffer ports are not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
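The Comparator entity and its Comparator1 architecture can be combined into one compilable file and exercised with a small testbench. The sketch below assumes a VHDL-93 (or later) simulator. The testbench entity Comparator_tb and its signal names are illustrative additions, not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Hypothetical self-checking testbench (illustrative names).
library ieee;
use ieee.std_logic_1164.all;

entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  -- Direct instantiation selects the architecture explicitly.
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C";
    b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1'
      report "equal inputs should give EQ = '1'" severity error;

    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0'
      report "different inputs should give EQ = '0'" severity error;

    wait;  -- end of stimulus
  end process stimulus;
end architecture sim;
```

Naming the architecture in the instantiation (work.Comparator(Comparator1)) avoids relying on the default binding rules.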

Configurations

A configuration links an entity declaration and an architecture body together.
The concept of a default configuration is a bit messy in VHDL '87
- the last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL
source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture.
Configurations are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.
A package consists of a declaration part and an optional body part.

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
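As an illustration of a package declaration with a matching body, the sketch below collects a few horizontal timing constants in one place. The package name vga_timing_pkg, the constant names, and the function in_active_row are hypothetical. The numeric values are the counter limits used in the sync_unit listing later in this report, not figures from a VGA standard.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; the names are illustrative only.
package vga_timing_pkg is
  constant H_COUNT_MAX    : integer := 793;  -- last value of the horizontal counter
  constant H_SYNC_END     : integer := 95;   -- hsync is low while counter <= 95
  constant H_ACTIVE_START : integer := 143;  -- first count with h_en asserted
  constant H_ACTIVE_END   : integer := 778;  -- last count with h_en asserted

  subtype pixel_t is std_logic_vector(0 downto 0);

  -- Function declaration: the body lives in the package body.
  function in_active_row (c_hs : integer) return boolean;
end package vga_timing_pkg;

package body vga_timing_pkg is
  function in_active_row (c_hs : integer) return boolean is
  begin
    return (c_hs >= H_ACTIVE_START) and (c_hs <= H_ACTIVE_END);
  end function in_active_row;
end package body vga_timing_pkg;
```

A design unit compiled into the same library would make these names visible with "use work.vga_timing_pkg.all;".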

Libraries

A library is a collection of VHDL design units (a database):

1 Packages
- package declaration
- package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

A library is usually a directory in a UNIX file system.
It can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways, listed here in
order of increasing level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

The circuit is described in terms of its components.
This can range from a low-level description (e.g. transistor-level
description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in
the system.
The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the
behavioral description of the registers yourself.
The behavioral description can be provided in the form of
subprograms (functions or procedures).
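Since the language has no built-in register, one has to be described. A common way, shown here as a sketch using a clocked process rather than the subprogram form mentioned above, is an 8-bit register with an asynchronous active-low reset and a clock enable. The entity name reg8 and its port names are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register (illustrative names).
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                     -- asynchronous reset, active low
    en   : in  std_logic;                     -- clock enable
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      if en = '1' then
        q <= d;   -- value is held whenever en = '0'
      end if;
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognise this template and map it onto flip-flops, so the dataflow description only has to connect the combinational logic between such registers.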

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams, and
algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).
If no actual delay is used, only the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs Sequential

Processes

The basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
Declared with the process keyword in VHDL.
A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT
both.
The sensitivity list or wait statement(s) contain the signals which wake the
process up.

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
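The two ways of waking a process can be put side by side. In the sketch below (entity and signal names are illustrative assumptions), the first process uses a sensitivity list and the second an explicit wait statement as its last statement. The two forms behave the same way in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity (illustrative names).
entity process_styles is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity process_styles;

architecture rtl of process_styles is
begin
  -- Form 1: sensitivity list, no wait statement allowed inside.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, suspended by an explicit wait.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

Both processes run once at the start of simulation and then again on every event on a or b. Mixing the two forms in one process is rejected by the compiler, as stated above.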

11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- numeric_std is only needed by the commented-out to_unsigned line below
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;   -- 50 MHz board clock
    rstn   : in  std_logic;   -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- picture memory, 1 bit per pixel
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  -- mode of line_count and c_hs set to out to match the sync_unit entity
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- divide the 50 MHz input clock by two to obtain the 25 MHz pixel clock
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse
  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used to detect the start of the active region
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;

PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en: restart the memory address for a new frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the picture: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address after the last stored pixel
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- blanking: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was inducted
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic (see Soft
processors below).

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

            Architecture

            The most common FPGA architecture consists of an array of configurable

            logic blocks (CLBs) IO pads and routing channels Generally all the routing

            channels have the same width (number of wires) Multiple IO pads may fit

            into the height of one row or the width of one column in the array

            An application circuit must be mapped into an FPGA with adequate resources.

            While the number of CLBs and I/Os required is easily determined from the

            design, the number of routing tracks needed may vary considerably even

            among designs with the same amount of logic. (For example, a crossbar switch

            requires much more routing than a systolic array with the same gate count.)

            Since unused routing tracks increase the cost (and decrease the performance)

            of the part without providing any benefit, FPGA manufacturers try to provide

            just enough tracks so that most designs that will fit in terms of LUTs and I/Os

            can be routed. This is determined by estimates such as those derived from

            Rent's rule or by experiments with existing designs.

            The FPGA is an array, or island-style, FPGA. It consists of an array of logic

            blocks and routing channels. Two I/O pads fit into the height of one row or the

            width of one column, as shown below. All the routing channels have the same

            width (number of wires).

            12.1 FPGA structure

            A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

            flip-flop, as shown below. In recent years, manufacturers have started moving

            to 6-input LUTs in their high-performance parts, claiming increased

            performance.

            Typical logic block

            There is only one output, which can be either the registered or the unregistered

            LUT output. The logic block has four inputs for the LUT and a clock input.

            Since clock signals (and often other high-fanout signals) are normally routed

            via special-purpose dedicated routing networks in commercial FPGAs, they

            are managed separately from other signals.
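
            The logic block described above can be sketched behaviorally in Python. This is
            only an illustration, not vendor code: the class name LogicBlock, its method
            names and the AND4 truth table are assumptions made for the example.

```python
class LogicBlock:
    """Behavioral sketch of one logic block: a 4-input LUT plus a flip-flop."""

    def __init__(self, truth_table, registered):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered  # True selects the flip-flop output
        self.flip_flop = 0            # flip-flop state, assumed cleared at start

    def lut(self, a, b, c, d):
        """Look up the combinational result for the four input bits."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock_edge(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the single block output (registered or unregistered)."""
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


# Truth table for a 4-input AND: only index 15 (all inputs high) yields 1.
AND4 = [0] * 15 + [1]

combinational = LogicBlock(AND4, registered=False)
registered = LogicBlock(AND4, registered=True)
```

            The unregistered block follows its inputs immediately, while the registered
            block changes its output only after clock_edge is called.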

            For this example architecture, the locations of the FPGA logic block pins are

            shown below.

            Logic Block Pin Locations

            Each input is accessible from one side of the logic block, while the output pin

            can connect to routing wires in both the channel to the right and the channel

            below the logic block.

            Each logic block output pin can connect to any of the wiring segments in the

            channels adjacent to it.

            Similarly, an I/O pad can connect to any one of the wiring segments in the

            channel adjacent to it. For example, an I/O pad at the top of the chip can

            connect to any of the W wires (where W is the channel width) in the horizontal

            channel immediately below it.

            Generally, the FPGA routing is unsegmented. That is, each wiring segment

            spans only one logic block before it terminates in a switch box. By turning on

            some of the programmable switches within a switch box, longer paths can be

            constructed. For higher-speed interconnect, some FPGA architectures use

            longer routing lines that span multiple logic blocks.

            Whenever a vertical and a horizontal channel intersect, there is a switch box. In

            this architecture, when a wire enters a switch box, there are three

            programmable switches that allow it to connect to three other wires in adjacent

            channel segments. The pattern, or topology, of switches used in this

            architecture is the planar or domain-based switch box topology. In this switch

            box topology, a wire in track number one connects only to wires in track

            number one in adjacent channel segments, wires in track number 2 connect

            only to other wires in track number 2, and so on. The figure below illustrates

            the connections in a switch box.

            Switch box topology
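
            The planar (domain-based) rule can be written down in a few lines. The sketch
            below is illustrative only; the side names and the function
            planar_switch_targets are assumptions introduced for the example.

```python
SIDES = ("left", "right", "top", "bottom")


def planar_switch_targets(side, track, channel_width):
    """List the (side, track) wires reachable from a wire entering a switch box.

    In a planar switch box a wire on track t can only reach track t on the
    three other sides, so every wire has exactly three programmable switches.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track must lie inside the channel")
    return [(other, track) for other in SIDES if other != side]


targets = planar_switch_targets("left", 1, channel_width=4)
```

            Because the track number never changes, a signal that starts on track 1 stays
            in the track 1 "domain" for its whole route.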

            Modern FPGA families expand upon the above capabilities to include higher-

            level functionality fixed into the silicon. Having these common functions

            embedded into the silicon reduces the area required and gives those functions

            increased speed compared to building them from primitives. Examples of these

            include multipliers, generic DSP blocks, embedded processors, high-speed I/O

            logic and embedded memories.

            FPGAs are also widely used for systems validation, including pre-silicon

            validation, post-silicon validation and firmware development. This allows chip

            companies to validate their design before the chip is produced in the factory,

            reducing the time to market.

            12.4 FPGA Design and Programming

            To define the behavior of the FPGA, the user provides a hardware description

            language (HDL) or a schematic design. The HDL form might be easier to work

            with when handling large structures because it's possible to just specify them

            numerically rather than having to draw every piece by hand. On the other hand,

            schematic entry can allow for easier visualisation of a design.

            Then, using an electronic design automation tool, a technology-mapped netlist

            is generated. The netlist can then be fitted to the actual FPGA architecture

            using a process called place-and-route, usually performed by the FPGA

            company's proprietary place-and-route software. The user will validate the

            map, place and route results via timing analysis, simulation and other

            verification methodologies. Once the design and validation process is

            complete, the binary file generated (also using the FPGA company's

            proprietary software) is used to (re)configure the FPGA.

            Going from schematic/HDL source files to actual configuration: the source

            files are fed to a software suite from the FPGA/CPLD vendor that, through

            different steps, will produce a file. This file is then transferred to the

            FPGA/CPLD via a serial interface (JTAG) or to an external memory device

            like an EEPROM.

            The most common HDLs are VHDL and Verilog, although, in an attempt to

            reduce the complexity of designing in HDLs, which have been compared to the

            equivalent of assembly languages, there are moves to raise the abstraction level

            through the introduction of alternative languages.

            To simplify the design of complex systems in FPGAs, there exist libraries of

            predefined complex functions and circuits that have been tested and optimized

            to speed up the design process. These predefined circuits are commonly called

            IP cores, and are available from FPGA vendors and third-party IP suppliers

            (rarely free, and typically released under proprietary licenses). Other

            predefined circuits are available from developer communities such as

            OpenCores (typically free, and released under the GPL, BSD or similar license) and

            other sources.

            In a typical design flow, an FPGA application developer will simulate the

            design at multiple stages throughout the design process. Initially, the RTL

            description in VHDL or Verilog is simulated by creating test benches to

            simulate the system and observe results. Then, after the synthesis engine has

            mapped the design to a netlist, the netlist is translated to a gate-level

            description, where simulation is repeated to confirm the synthesis proceeded

            without errors. Finally, the design is laid out in the FPGA, at which point

            propagation delays can be added and the simulation run again with these

            values back-annotated onto the netlist.

            Basic Process Technology Types

            SRAM - based on static memory technology. In-system programmable and

            re-programmable. Requires external boot devices. CMOS.

            Antifuse - One-time programmable. CMOS.

            EPROM - Erasable Programmable Read-Only Memory technology. Usually

            one-time programmable in production because of plastic packaging.

            Windowed devices can be erased with ultraviolet (UV) light. CMOS.

            EEPROM - Electrically Erasable Programmable Read-Only Memory

            technology. Can be erased, even in plastic packages. Some, but not all,

            EEPROM devices can be in-system programmed. CMOS.

            Flash - Flash-erase EPROM technology. Can be erased, even in plastic

            packages. Some, but not all, flash devices can be in-system programmed.

            Usually, a flash cell is smaller than an equivalent EEPROM cell and is

            therefore less expensive to manufacture. CMOS.

            Fuse - One-time programmable. Bipolar.

            Major Manufacturers

            Xilinx and Altera are the current FPGA market leaders and long-time industry

            rivals. Together, they control over 80 percent of the market, with Xilinx alone

            representing over 50 percent.

            Xilinx also provides free Windows and Linux design software, while Altera

            provides free Windows tools; the Solaris and Linux tools are only available via

            a rental scheme.

            Other competitors include Lattice Semiconductor (flash, SRAM), Actel

            (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),

            Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-

            focused CSSP, no general-purpose FPGAs).

            12.5 FPGA prototype

            FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC

            prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for

            hardware verification and early software development.

            Verifying the hardware design alongside early software and firmware

            co-design has become mainstream. Prototyping SoC and ASIC

            designs on an FPGA has become a good method to do this.

            Reasons why Prototyping is important

            1 Running an SoC design on an FPGA prototype is a reliable way to ensure that it is

            functionally correct. This is compared to designers relying only on software

            simulations to verify that their hardware design is sound. Simulation speed and

            modeling accuracy limitations hinder this development.

            2 Due to time constraints, many projects cannot wait until the silicon is back from

            the foundry to start on software tests. FPGA prototyping allows for much more

            time in the area of software development and testing at the software-hardware

            integration stage. This helps expose many unforeseen software bugs that appear due

            to today's array of operating systems, applications and hardware.

            3 Prototyping also allows the developer to ensure that all IP technologies on the

            system work well together, off the simulation stage and in actual form.

            4 Prototyping has the added advantage of serving as a demo platform for SoC clients,

            bringing in interest early. This speeds up the overall development cycle and

            allows for more enhancement or improvement to the chip features than would

            otherwise have been possible.

            13 SIMULATION RESULTS

            Fig. VGA Controller (gate level)

            Fig. Sync Unit (gate level)

            Fig. Technology schematic

            14 APPLICATIONS

            Motion Tracking Systems

            Image processing systems

            Displaying Systems

            14.1 CONCLUSION

            Hence, by implementing the VGA Controller on an FPGA, a processor-generated

            (pixelated) picture is taken from memory space and the Controller provides

            digital RGB values for each pixel, as well as horizontal and vertical

            synchronization signals, in order to correctly display the picture on a

            connected monitor.

            REFERENCES

            1. http://en.wikipedia.org/wiki/Video_Graphics_Array

            2. "A VGA display controller" by Eduardo Sanchez

            3. Hinner, "VGA Interface and video signal documents"

              4.2 A photograph of sub-pixel display elements on a laptop's LCD screen

              In digital imaging, a pixel (or picture element [1]) is a single point in a raster

              image. The pixel is the smallest addressable screen element; it is the smallest

              unit of picture which can be controlled. Each pixel has its own address. The address

              of a pixel corresponds to its coordinate. Pixels are normally arranged in a

              2-dimensional grid and are often represented using dots or squares. Each pixel is

              a sample of an original image, where more samples typically provide

              more-accurate representations of the original. The intensity of each pixel is variable.

              In color image systems, a color is typically represented by three or four

              component intensities, such as red, green and blue, or cyan, magenta, yellow

              and black.

              Color refreshing rate

              Color refreshing rate = 60 Hz

              The color refreshing rate represents the number of frames that are transmitted per

              second.

              The minimum number of frames that the VGA controller must transmit per second

              is 60.

              5 Hardware Description

              5.1 Block Diagram

              5.1 VGA Synchronization Unit

              The Synchronization Unit provides the horizontal and vertical

              synchronization signals (HSYNC and VSYNC) that are required to correctly

              display a picture frame within the confines of a monitor's display area.

              These synchronization signals are used as control inputs by the

              monitor's horizontal and vertical deflection circuits. These circuits deflect the

              electrons emitted by the three primary color electron guns (Red, Green, Blue)

              left to right and from top to bottom, respectively. HSYNC provides the start

              and stop times for the horizontal deflection circuit, so that a line of pixels is

              correctly drawn across the screen display. VSYNC provides the start and stop

              times for the vertical deflection circuit, so that the lines of a frame are correctly

              drawn from the top to the bottom of the screen display.

              The resolution for the display is defined by the level on the

              RESOLUTION input. If High, the 640x480 resolution is used (VGA). If Low,

              the 800x600 resolution (SVGA) is used.

              Although the resolution determines the area of a monitor's screen

              within which an image can be displayed, the full extents of the chosen

              resolution do not have to be used. The actual extents of the image display area

              on the screen can be controlled by the use of the DISPSIZE_H and

              DISPSIZE_V inputs. These inputs determine the total number of pixels to be

              used in a line and the total number of lines in a frame, respectively.

              5.2 Blank pixel generation

              The total number of viewable pixels in a line and viewable lines in a

              frame is determined by the display resolution chosen through the

              RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on

              the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the

              chosen display resolution is used or not, the areas of the monitor screen to the

              top, bottom, left and right of the viewable frame area are blanked by putting

              black pixels at the required line-pixel positions. This has the effect of centering

              the image on the screen.

              The color generated for a pixel in the Pixel Unit depends on whether

              the particular pixel needs to be blanked or not. The Synchronization Unit

              provides a signal to the Pixel Unit for this very reason. This is the line display

              enable signal, en. The signal is checked on each rising edge of the external

              clock signal (CLK) and is set as follows:

              If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then

              en = 0 (pixel requires to be blanked: set color to be black)

              Else

              en = 1 (pixel is a viewable pixel: generate RGB color accordingly)
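
              The enable rule above can be checked with a short Python sketch. The function
              names are assumptions made for illustration, and the line period of 800 clocks
              and frame period of 525 lines used in the example are assumed values, not
              figures stated in this section.

```python
def line_display_enable(hcount, vcount, pix_total, lin_total):
    """Return en: 1 for a viewable pixel, 0 for a pixel that must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


def count_viewable(pix_total, lin_total, line_period, frame_period):
    """Count counter positions with en = 1 over one full frame."""
    return sum(
        line_display_enable(h, v, pix_total, lin_total)
        for v in range(frame_period)
        for h in range(line_period)
    )


# Assumed periods: 800 clocks per line and 525 lines per frame.
viewable = count_viewable(640, 480, line_period=800, frame_period=525)
```

              With PixTotal = 640 and LinTotal = 480, exactly 640 x 480 counter positions
              are enabled, and every other position in the frame is blanked.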

              6 SIGNALS IN VGA CONTROLLER

              For the most common VGA mode (640x480, 60 Hz, non-interlaced), the horizontal

              timings are:

              Parameter                      Value      Unit
              Clock frequency                25.175     MHz
              Horizontal pixels              640
              Horizontal sync polarity       Negative
              Total time for each line       31.77      µs
              Front porch (A)                0.94       µs
              Sync pulse length (B)          3.77       µs
              Back porch (C)                 1.89       µs
              Active video (D)               25.17      µs

              (Total horizontal sync time: 6.60 µs)

              6.1 The vertical timings are

              Parameter                      Value      Unit
              Vertical lines                 480
              Vertical sync polarity         Negative
              Vertical frequency             59.94      Hz
              Front porch (E)                0.35       ms
              Sync pulse length (F)          0.06       ms
              Back porch (G)                 1.02       ms
              Active video (H)               15.25      ms

              (Total vertical sync time: 1.43 ms)

              640 x 400 at 70 Hz is the video mode used for booting most x86 personal

              computers.

              640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

              It should be noted that the actual timings vary slightly. For example, for

              640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of

              25.174 MHz gives 633 pixels rather than the expected 640 pixels.
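
              The arithmetic behind these timings can be reproduced with a few lines of
              Python. This is a sketch using the horizontal values listed for 640x480 at
              60 Hz; the dictionary and function names are assumptions for illustration.

```python
# Horizontal timings for 640x480 at 60 Hz, in microseconds.
HORIZONTAL_US = {
    "front_porch": 0.94,
    "sync_pulse": 3.77,
    "back_porch": 1.89,
    "active_video": 25.17,
}


def pixel_clocks(duration_us, clock_mhz):
    """Number of pixel clock periods in a duration (microseconds x MHz)."""
    return duration_us * clock_mhz


line_time_us = sum(HORIZONTAL_US.values())
blanking_us = line_time_us - HORIZONTAL_US["active_video"]
active_pixels = int(pixel_clocks(HORIZONTAL_US["active_video"], 25.174))
clocks_per_line = round(pixel_clocks(line_time_us, 25.175))
```

              The four intervals add up to the 31.77 µs line time, the three blanking
              intervals to 6.60 µs, and the truncated active pixel count comes out at 633.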

              6.2 Generation of the horizontal synchronization signal - HSYNC

              The HSYNC signal is High (inactive) after an external reset signal

              (RST) is received by the VGA Controller. The signal is updated on each rising

              edge of the external clock signal (CLK).

              The state of the HSYNC signal depends on the value stored in the horizontal

              counter, and is driven low when:

              HCOUNT >= (PixTotal + BlankingLeft)

              and remains low while:

              HCOUNT < (PixTotal + BlankingLeft + TLSync)
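
              The HSYNC condition above amounts to a window comparison on the horizontal
              counter, sketched below in Python. The function names are assumptions, and the
              example values (BlankingLeft = 16, TLSync = 96, line period = 800) are
              illustrative figures chosen for the sketch, not values taken from this report.

```python
def hsync_level(hcount, pix_total, blanking_left, tl_sync):
    """Return the HSYNC level: 0 (active low) inside the sync window, else 1."""
    start = pix_total + blanking_left
    if start <= hcount < start + tl_sync:
        return 0
    return 1


def low_clocks_per_line(line_period, pix_total, blanking_left, tl_sync):
    """Count the clocks in one line for which HSYNC is driven low."""
    return sum(
        1
        for hcount in range(line_period)
        if hsync_level(hcount, pix_total, blanking_left, tl_sync) == 0
    )


# Illustrative values only: 640 viewable pixels, 16 blanked, 96-clock pulse.
low_clocks = low_clocks_per_line(800, 640, 16, 96)
```

              The pulse is low for exactly TLSync clocks, starting BlankingLeft clocks after
              the last viewable pixel. The VSYNC rule has the same shape, with VCOUNT,
              LinTotal and TFSync in place of the horizontal quantities.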

              Vertical (Frame) Period

              6.3 Generation of the vertical synchronization signal - VSYNC

              The VSYNC signal is High (inactive) after an external reset signal

              (RST) is received by the VGA Controller. The signal is updated after every

              line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

              The state of the VSYNC signal depends on the value stored in the vertical

              counter, and is driven low when:

              VCOUNT >= (LinTotal + BlankingLeft)

              and remains low while:

              VCOUNT < (LinTotal + BlankingLeft + TFSync)

              Address Counter (MEMADDR)

              The address counter is used to store the position of the next consecutive

              pixel in the frame. Its value is passed to the Pixel Unit on the internal bus

              signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain

              the next pixel from picture memory.

              The counter is reset to zero when the VGA Controller receives an external

              reset signal (RST). The size of the counter depends on the values chosen for

              DISPSIZE_H and DISPSIZE_V, as the range is simply:

              0 to (PixTotal x LinTotal) - 1

              Taking the maximum number of pixels in a line and lines in a frame for each

              of the supported display resolutions, the maximum ranges for the counter are:

              • 640x480: 0 to 307199

              • 800x600: 0 to 479999

              The counter has 19-bit resolution.
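
              The 19-bit figure follows directly from the largest address the counter must
              hold. A minimal Python check, with a hypothetical helper name:

```python
def address_counter_bits(pix_total, lin_total):
    """Bits needed to count from 0 to (pix_total x lin_total) - 1."""
    max_address = pix_total * lin_total - 1
    return max_address.bit_length()


bits_vga = address_counter_bits(640, 480)   # largest address 307199
bits_svga = address_counter_bits(800, 600)  # largest address 479999
```

              Both maxima lie between 2^18 = 262144 and 2^19 = 524288, so 19 bits are enough
              for either resolution.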

              While the value in the horizontal counter (HCOUNT) is less than the

              total number of viewable pixels in a line (PixTotal, the integer value of

              DISPSIZE_H), the counter is incremented on the rising edge of the external

              clock signal (CLK). Pixel addressing within the frame is consecutive. When

              the counter reaches the last pixel in a line, its incremented value is the first

              pixel in the next line down.

              The address counter will continue to be incremented until the value in the

              vertical counter (VCOUNT) is greater than or equal to the total number of

              viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this

              point, it will be rolled over to zero.

              7 VGA Controller

              The VGA Controller provides a simple 8-bit interface between a host

              microcontroller and any VGA-compatible monitor. This variant of the

              Controller provides six modes of display, depending on the resolution chosen

              (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &

              White, 16 Colors or 64 Colors).

              7.1 VGA Controller Tasks

              1) To display the frames at a minimum 60 Hz refresh rate:

              1 sec = 60 frames

              1 frame = max 1/60 sec

              1 pixel = (1/60 sec) / 307200 = approx. 54 ns

              Max time for 1 pixel = 40 ns (allowing for the blanking intervals around the

              visible area)

              2) To send information for each and every pixel of a frame by means of

              RGB signals.

              3) After completely displaying one row of information, the column must be

              reset and the row incremented; this is done by the Hsync signal.

              4) After completing one frame of information, a new frame has to be started;

              for this, both row and column are reset, which is done by the Vsync signal.

              => For giving information for 1 pixel we use 1 clock (for controlling all

              operations):

              1 clock = 1 pixel

              1 clock cycle period = 40 ns

              Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

              So if we use a 25 MHz clock and display one pixel of information on each and every

              clock, then we can attain the minimum 60 Hz refresh rate.

              Do not use a clock frequency of less than 25 MHz.

              Ex:

              Clock frequency    Time period    No. of frames/sec
              15 MHz             More           Less (unable to display 60 frames/sec)
              25 MHz             (effectively utilized frequency)
              30 MHz             Less           More

              As the clock frequency rises, the time period decreases and the number of

              frames/sec increases.
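
              The relation between clock frequency, clock period and frame rate can be
              worked through numerically. The sketch below assumes one pixel per clock and a
              frame of 800 x 525 clocks including blanking; that total is an assumption
              made for the example, not a figure given in this report, and the function
              names are hypothetical.

```python
VISIBLE_PIXELS = 640 * 480
CLOCKS_PER_FRAME = 800 * 525  # assumed total per frame, blanking included


def clock_period_ns(clock_hz):
    """Clock period in nanoseconds for a given clock frequency."""
    return 1e9 / clock_hz


def frames_per_second(clock_hz, clocks_per_frame):
    """Frame rate when exactly one pixel slot is sent per clock."""
    return clock_hz / clocks_per_frame


# Time available per pixel if only the visible pixels had to be sent.
visible_only_budget_ns = 1e9 / (60 * VISIBLE_PIXELS)

rate_15 = frames_per_second(15e6, CLOCKS_PER_FRAME)
rate_25 = frames_per_second(25e6, CLOCKS_PER_FRAME)
rate_30 = frames_per_second(30e6, CLOCKS_PER_FRAME)
```

              Under these assumptions a 15 MHz clock falls well short of 60 frames per
              second, 25 MHz lands at roughly 60, and 30 MHz gives a higher rate still.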

              Signals used for image display:

              R (Red)

              G (Green)

              B (Blue)

              HSYNC (Horizontal synchronization signal)

              VSYNC (Vertical synchronization signal)

              All the above signals are used to control the image data so that it is correctly

              displayed on the VGA display monitor.

              7.2 Timings for front porch, sync pulse and back porch

                              Horizontal sync    Vertical sync
              Front porch     0.94 µs            0.35 ms
              Sync pulse      3.77 µs            0.063 ms
              Back porch      1.89 µs            1.02 ms

              8 FUNCTIONAL DESCRIPTION

              8.1 Symbol

              VGA Controller Symbol

              Pin description

              The pin description is shown in the table below.

              Table: VGA Pin description

              Name (Type, Polarity, Bus size): Description

              Global Control Signals

              CLK (I, Rising): Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
                  640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
                  800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

              RST (I, High): Global system reset.

              VGA Input Settings Signals

              RESOLUTION (I, High/Low): This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

              CMOD (I, bus size 2): This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

              DISPSIZE_H (I, bus size 10): This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

              DISPSIZE_V (I, bus size 10): This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

              Data Memory Control Signals

              RD (O, High): This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en, generated by the Synchronization Unit of the Controller.

              DATA (I, bus size 8): Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows:
                  Black & White mode: 8 1-bit pixels
                  16 Colors mode: 2 4-bit pixels
                  64 Colors mode: 1 6-bit pixel (DATA[5..0])
                  Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

              ADDR_PIXEL (O, bus size 19): Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

              VGA Monitor Control Signals

              HSYNC (O, Falling): Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

              VSYNC (O, Falling): Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

              R1 (O, High/Low): Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

              G1 (O, High/Low): Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

              B1 (O, High/Low): Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

              9 VGA SIGNAL TIMING

              Figure 3 summarizes the signal timing involved in sending a line of

              pixels and a frame of lines. The actual time values differ according to the

              resolution selected (640x480 or 800x600), the processor-defined values for

              DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal

              (CLK).

              9.1 VGA PIXEL UNIT

              The Pixel Unit provides access to the pixelated image stored in external

              picture memory, reading in data a byte at a time and formatting each pixel to

              be displayed. For each pixel, the 6-bit RGB value required for the control of

              the monitor's three primary color electron guns is generated, so that the pixel

              is displayed on the screen with the correct color.

              Accessing the stored image

              The image to be displayed on the monitor screen is written by the host

              microcontroller into external memory space (RAM). This memory space can

              be located anywhere (e.g. using a block of RAM within the FPGA design, or

              using the SRAM on the NanoBoard itself).

              Pixel data is stored in the memory space in bytes. The number of pixels in a

              byte depends on which color palette is being used for the image (selected by

              the CMOD input):

              • in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

              • in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

              • in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel

              (DATA[7..6] are not used)

              The size of memory required to store a picture is determined by the total

              number of viewable pixels in a line (determined by DISPSIZE_H), the total

              number of viewable lines in a frame (determined by DISPSIZE_V) and the

              number of pixels stored in each byte in memory space:

              Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
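
              As a worked example of this formula, the Python sketch below computes the
              picture memory size for each palette. The mode labels and the function name
              are assumptions introduced for illustration.

```python
# Pixels packed into each byte for the three color palettes.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory needed, rounded up to a whole byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return (pixels + per_byte - 1) // per_byte


bw_bytes = picture_memory_bytes(640, 480, "bw")
c16_bytes = picture_memory_bytes(640, 480, "16_colors")
c64_bytes = picture_memory_bytes(640, 480, "64_colors")
```

              For a full 640x480 frame this gives 38400 bytes in Black & White mode, 153600
              bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.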

              The address in RAM where the next pixel is stored is determined using

              an internal signal provided by the Synchronization Unit, hvcnt, which

              reflects the current contents of the MEMADDR register. The exact addressing

              is described below.

              9.2 Black & White mode

              The picture memory address (the byte of data containing the next 8 pixels)

              is determined by using bits 18..3 of hvcnt and right shifting the contents by

              three:

              ADDR_PIXEL = 00 & hvcnt[18..3]

              16 Colors mode

              The picture memory address (the byte of data containing the next 2 pixels)

              is determined by using bits 18..1 of hvcnt and right shifting the contents by

              one:

              ADDR_PIXEL = 0 & hvcnt[18..1]

              64 Colors mode

              The picture memory address (the byte of data containing the next pixel) is

              determined by using the full value of hvcnt:

              ADDR_PIXEL = hvcnt[18..0]
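
              The three addressing rules differ only in the size of the right shift applied
              to hvcnt. The Python sketch below models them on integers; the mode labels and
              the function name are assumptions, and hvcnt is treated as a 19-bit value.

```python
# Right shift applied to hvcnt in each color mode.
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}
HVCNT_MASK = (1 << 19) - 1  # hvcnt is modelled as a 19-bit counter value


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel numbered hvcnt."""
    return (hvcnt & HVCNT_MASK) >> ADDRESS_SHIFT[mode]


first_byte = addr_pixel(7, "bw")    # pixels 0..7 share byte 0
second_byte = addr_pixel(8, "bw")   # pixel 8 starts byte 1
```

              Pixels 0 to 7 map to byte 0 in Black & White mode, pixels 2 and 3 map to
              byte 1 in 16 Colors mode, and in 64 Colors mode the address equals the pixel
              number.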

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
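The enex condition above could be written in VHDL roughly as follows. The entity name `enex_gen` and the generic defaults (640, 480, 800, 525) are assumptions chosen for a 640 x 480 display, and driving RD directly from enex is a simplification of the behavior described.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity enex_gen is
  generic (
    PIX_TOTAL    : natural := 640;  -- assumed visible pixels per line
    LIN_TOTAL    : natural := 480;  -- assumed visible lines per frame
    LINE_PERIOD  : natural := 800;  -- assumed pixel clocks per line
    FRAME_PERIOD : natural := 525   -- assumed lines per frame
  );
  port (
    hcount : in  natural;
    vcount : in  natural;
    enex   : out std_logic;
    rd     : out std_logic
  );
end entity enex_gen;

architecture rtl of enex_gen is
  signal enex_i : std_logic;
begin
  -- Low while either counter is in its blanking interval.
  enex_i <= '0' when (hcount > PIX_TOTAL + 1 and hcount < LINE_PERIOD)
                  or (vcount > LIN_TOTAL + 1 and vcount < FRAME_PERIOD)
            else '1';

  enex <= enex_i;
  rd   <= enex_i;  -- read access is disabled whenever enex is Low
end architecture rtl;
```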

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
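The load and shift behavior of PIXREG in the three modes might be modeled as below. This is a sketch only: the entity name `pixreg_unit`, the `mode` encoding, the `mem_dat` port and the asynchronous, active-high reset are assumptions, not details taken from the controller itself.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pixreg_unit is
  port (
    clk     : in  std_logic;
    rst     : in  std_logic;                      -- assumed active-high RST
    mode    : in  std_logic_vector(1 downto 0);   -- hypothetical mode encoding
    hvcnt   : in  std_logic_vector(18 downto 0);
    mem_dat : in  std_logic_vector(7 downto 0);   -- byte read from memory
    pixreg  : out std_logic_vector(7 downto 0)
  );
end entity pixreg_unit;

architecture rtl of pixreg_unit is
  signal reg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => '0');
    elsif rising_edge(clk) then
      case mode is
        when "00" =>                          -- Black & White
          if hvcnt(2 downto 0) = "000" then
            reg <= mem_dat;                   -- load the next 8 pixels
          else
            reg <= '0' & reg(7 downto 1);     -- move next pixel into bit 0
          end if;
        when "01" =>                          -- 16 Colors
          if hvcnt(0) = '0' then
            reg <= mem_dat;                   -- load the next 2 pixels
          else
            reg <= "0000" & reg(7 downto 4);  -- move second pixel into low nibble
          end if;
        when others =>                        -- 64 Colors
          reg <= mem_dat;                     -- one pixel per byte, load every clock
      end case;
    end if;
  end process;

  pixreg <= reg;
end architecture rtl;
```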

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.
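A minimal sketch of the RGBREG update with blanking is given here for the Black & White and 64 Colors cases. The entity name `rgbreg_unit`, the `mono` select input and the asynchronous reset are assumptions. The Black & White mapping used (0 to 000000, 1 to 111111) is also an assumption, and the 16 Colors mapping is left out because it depends on the table values.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgbreg_unit is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                     -- assumed active-high RST
    en     : in  std_logic;                     -- line display enable
    mono   : in  std_logic;                     -- hypothetical: '1' = Black & White
    pixreg : in  std_logic_vector(7 downto 0);
    rgbreg : out std_logic_vector(5 downto 0)
  );
end entity rgbreg_unit;

architecture rtl of rgbreg_unit is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= (others => '0');         -- 000000 is black
    elsif rising_edge(clk) then
      if en = '0' then
        rgbreg <= (others => '0');       -- blanking: force black
      elsif mono = '1' then
        rgbreg <= (others => pixreg(0)); -- assumed mapping: 0 = black, 1 = white
      else
        rgbreg <= pixreg(5 downto 0);    -- 64 Colors: pixel value used directly
      end if;
    end if;
  end process;
end architecture rtl;
```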


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4).

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
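Besides CORE Generator or primitive instantiation, a single-port memory with synchronous read and write can also be described behaviorally. The sketch below is a generic example, not code from this project: the entity name `sp_ram` and its generics are assumptions, and whether a given synthesis tool maps it onto block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram is
  generic (
    ADDR_WIDTH : positive := 11;  -- 2K words x 8 bits = 16K data bits
    DATA_WIDTH : positive := 8
  );
  port (
    clk  : in  std_logic;
    en   : in  std_logic;   -- clock enable
    we   : in  std_logic;   -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity sp_ram;

architecture rtl of sp_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the old contents appear on dout one clock
        -- edge after the address is presented (read-first behavior).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write need a clock edge and an active clock enable, which matches the synchronous port behavior described above.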


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

              Interoperability

              Technology independence

              Design reuse

              Several levels of abstraction

              Readability

              Standard language

              Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

              Design specification language

              Design entry language

              Design simulation language

              Design documentation language

              An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
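As a small illustration of how the std_logic type (IEEE 1164) and the unsigned type (IEEE 1076.3, package numeric_std) are typically combined, here is a sketch of a 4-bit counter. The entity name `counter4` is hypothetical and the example is not part of the VGA design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

entity counter4 is
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- synchronous, active-high reset
    q   : out std_logic_vector(3 downto 0)
  );
end entity counter4;

architecture rtl of counter4 is
  signal count : unsigned(3 downto 0);
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count <= (others => '0');
      else
        count <= count + 1;  -- unsigned arithmetic wraps from 15 back to 0
      end if;
    end if;
  end process;

  q <= std_logic_vector(count);  -- explicit conversion back to a plain vector
end architecture rtl;
```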

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with interface definition
Defines the inputs/outputs of a component (define pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
Entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have NO port declaration

Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together
Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate level architecture and a behavioral architecture
Are always optional
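To make the role of a configuration declaration concrete, the sketch below gives the Comparator entity two architectures and then selects one of them explicitly. The second architecture `Comparator2` and the configuration name `Comparator_cfg` are hypothetical additions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical behavioral alternative with the same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- The configuration binds the entity to one specific architecture,
-- independent of which architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Switching the simulated implementation then only requires selecting a different configuration; the entity and architecture sources stay untouched.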

Packages

Packages contain information common to many design units

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
Consists of a declaration part and an optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
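The split between package declaration and package body can be sketched as follows. The package name `vga_types`, the subtype `rgb_code`, the constant `BLACK` and the function `blank` are hypothetical names chosen to echo the RGB register description; they do not appear in the controller code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_types is
  -- Subtype declaration shared by every design unit that uses the package.
  subtype rgb_code is std_logic_vector(5 downto 0);

  -- Deferred constant: declared here, value given in the package body.
  constant BLACK : rgb_code;

  -- Function declaration; the body lives in the package body.
  function blank(code : rgb_code; en : std_logic) return rgb_code;
end package vga_types;

package body vga_types is
  constant BLACK : rgb_code := "000000";

  -- Returns the code unchanged while the display is enabled,
  -- and black during blanking.
  function blank(code : rgb_code; en : std_logic) return rgb_code is
  begin
    if en = '1' then
      return code;
    else
      return BLACK;
    end if;
  end function blank;
end package body vga_types;
```

A design unit would make these names visible with `use work.vga_types.all;`, assuming the package is compiled into the working library.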

Libraries

Collection of VHDL design units (database)

1 Packages
- package declaration
- package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

Usually a directory in a UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language
- either a lower level description
- or a behavioral description of sequential elements is needed

The lower level register descriptions must be created or obtained.

If there are no 3rd party models for registers => you must write the behavioral description of registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
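Since the language has no built-in register, a behavioral description along the following lines is commonly written by hand. This sketch uses a clocked process rather than a subprogram; the entity name `reg8` and the asynchronous, active-high reset are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg8 is
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- assumed asynchronous, active-high reset
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;  -- the value is captured only on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognize this template and map it onto flip-flops, so the dataflow description can then treat `reg8` as its register element.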

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
Process keyword in VHDL
Process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
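The two ways of waking a process up can be compared side by side. In the sketch below (the entity name `and_demo` and its ports are hypothetical), both processes describe the same AND function: a sensitivity list behaves like a `wait on` statement placed at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_demo is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity and_demo;

architecture behavioral of and_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process with_wait;
end architecture behavioral;
```

The two process statements themselves run concurrently with each other, while the statements inside each process execute in sequence.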


              11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;   -- board clock input
    rstn   : in  std_logic;   -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory (generated core): one bit per pixel
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode matches the sync_unit entity
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock clk
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- sinit is driven directly by rstn, as in the original listing
  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


              SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within a line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while the line counter is 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vsync pulse (Vertical Sync, Front Porch, Back Porch)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock; pixelgen uses it to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;


              PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: start of a new frame, restart at address 0
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: show the memory bit and advance the address
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the picture
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


              12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985, the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

              In the late 1980s the Naval Surface Warfare Department funded an experiment

              proposed by Steve Casselman to develop a computer that would implement

              600000 reprogrammable gates Casselman was successful and the system was

              awarded a patent in 1992

              Xilinx continued unchallenged and quickly growing from 1985 to the mid-

              1990s when competitors sprouted up eroding significant market-share By

              1993 Actel was serving about 18 percent of the market

              The 1990s were an explosive period of time for FPGAs both in sophistication

              and the volume of production In the early 1990s FPGAs were primarily used

              in telecommunications and networking By the end of the decade FPGAs

              found their way into consumer automotive and industrial applications

              FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
              algorithm technology and FPGAs to create a sound recognition device.
              Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA
              chip to decide the configuration needed to accomplish a sound recognition
              task.

              12.1 Modern developments

              A recent trend has been to take the coarse-grained architectural approach
              a step further by combining the logic blocks and interconnects of
              traditional FPGAs with embedded microprocessors and related peripherals
              to form a complete system on a programmable chip. This work mirrors the
              architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
              Group, which combined a reconfigurable CPU architecture on a single chip
              called the SB24. That work was done in 1982. Examples of such hybrid
              technologies can be found in the Xilinx Virtex-II PRO and Virtex-4
              devices, which include one or more PowerPC processors embedded within the
              FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses
              an AVR processor in combination with Atmel's programmable logic
              architecture.

              An alternative approach to using hard-macro processors is to make use of
              soft processor cores that are implemented within the FPGA logic (see Soft
              processors below).

              As previously mentioned, many modern FPGAs have the ability to be
              reprogrammed at run time, and this is leading to the idea of
              reconfigurable computing or reconfigurable systems: CPUs that reconfigure
              themselves to suit the task at hand. The Mitrion Virtual Processor from
              Mitrionics is an example of a reconfigurable soft processor implemented
              on FPGAs. However, it does not support dynamic reconfiguration at run
              time, but instead adapts itself to a specific program.

              Additionally, new non-FPGA architectures are beginning to emerge.
              Software-configurable microprocessors such as the Stretch S5000 adopt a
              hybrid approach by providing an array of processor cores and FPGA-like
              programmable cores on the same chip.

              Gates

              1987: 9,000 gates (Xilinx)
              1992: 600,000 gates (Naval Surface Warfare Department)
              Early 2000s: millions of gates

              Market size

              1985: First commercial FPGA technology invented by Xilinx
              1987: $14 million
              ~1993: >$385 million
              2005: $1.9 billion
              2010 estimates: $2.75 billion

              12.2 FPGA Comparisons

              Historically, FPGAs have been slower, less energy efficient, and generally
              achieved less functionality than their fixed ASIC counterparts. A
              combination of volume, fabrication improvements, research and
              development, and the I/O capabilities of new supercomputers has largely
              closed the performance gap between ASICs and FPGAs.

              Advantages of FPGAs include a shorter time to market, the ability to
              re-program in the field to fix bugs, and lower non-recurring engineering
              costs. Vendors can also take a middle road by developing their hardware
              on ordinary FPGAs but manufacturing their final version so that it can no
              longer be modified after the design has been committed.

              Xilinx claims that several market and technology dynamics are changing the
              ASIC/FPGA paradigm:

              IC costs are rising aggressively
              ASIC complexity has increased development time and costs
              R&D resources and headcount are decreasing
              Revenue losses for slow time-to-market are increasing
              Financial constraints in a poor economy are driving low-cost technologies

              These trends make FPGAs a better alternative than ASICs for a growing
              number of higher-volume applications than they have historically been
              used for, to which the company attributes the growing number of FPGA
              design starts (see History).

              The primary differences between CPLDs and FPGAs are architectural. A CPLD
              has a somewhat restrictive structure consisting of one or more
              programmable sum-of-products logic arrays feeding a relatively small
              number of clocked registers. The result of this is less flexibility, with
              the advantage of more predictable timing delays and a higher
              logic-to-interconnect ratio. The FPGA architectures, on the other hand,
              are dominated by interconnect. This makes them far more flexible (in
              terms of the range of designs that are practical for implementation
              within them) but also far more complex to design for.

              Another notable difference between CPLDs and FPGAs is the presence in most
              FPGAs of higher-level embedded functions (such as adders and multipliers)
              and embedded memories, as well as logic blocks that can implement
              decoders or mathematical functions.

              Some FPGAs have the capability of partial re-configuration, which lets one
              portion of the device be re-programmed while other portions continue
              running.

              12.3 Applications

              Applications of FPGAs include digital signal processing, software-defined
              radio, aerospace and defense systems, ASIC prototyping, medical imaging,
              computer vision, speech recognition, cryptography, bioinformatics,
              computer hardware emulation, radio astronomy, and a growing range of
              other areas.

              FPGAs originally began as competitors to CPLDs and competed in a similar
              space, that of glue logic for PCBs. As their size, capabilities, and
              speed increased, they began to take over larger and larger functions, to
              the point where some are now marketed as full systems on chips (SoC).
              Particularly with the introduction of dedicated multipliers into FPGA
              architectures in the late 1990s, applications which had traditionally
              been the sole reserve of DSPs began to incorporate FPGAs instead.

              FPGAs especially find applications in any area or algorithm that can make
              use of the massive parallelism offered by their architecture. One such
              area is code breaking, in particular brute-force attacks on cryptographic
              algorithms.

              FPGAs are increasingly used in conventional high-performance computing
              applications, where computational kernels such as FFT or convolution are
              performed on the FPGA instead of a microprocessor.

              The inherent parallelism of the logic resources on an FPGA allows for
              considerable computational throughput even at low MHz clock rates. The
              flexibility of the FPGA allows for even higher performance by trading off
              precision and range in the number format for an increased number of
              parallel arithmetic units. This has driven a new type of processing
              called reconfigurable computing, where time-intensive tasks are offloaded
              from software to FPGAs.

              The adoption of FPGAs in high-performance computing is currently limited
              by the complexity of FPGA design compared to conventional software and by
              the extremely long turn-around times of current design tools, where a
              wait of 4-8 hours is necessary after even minor changes to the source
              code.

              Traditionally, FPGAs have been reserved for specific vertical applications
              where the volume of production is small. For these low-volume
              applications, the premium that companies pay in hardware costs per unit
              for a programmable chip is more affordable than the development
              resources spent on creating an ASIC. Today, new cost and performance
              dynamics have broadened the range of viable applications.

              Architecture

              The most common FPGA architecture consists of an array of configurable
              logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
              routing channels have the same width (number of wires). Multiple I/O pads
              may fit into the height of one row or the width of one column in the
              array.

              An application circuit must be mapped into an FPGA with adequate
              resources. While the number of CLBs and I/Os required is easily
              determined from the design, the number of routing tracks needed may vary
              considerably, even among designs with the same amount of logic. (For
              example, a crossbar switch requires much more routing than a systolic
              array with the same gate count.) Since unused routing tracks increase the
              cost (and decrease the performance) of the part without providing any
              benefit, FPGA manufacturers try to provide just enough tracks so that
              most designs that will fit in terms of LUTs and I/Os can be routed. This
              is determined by estimates such as those derived from Rent's rule or by
              experiments with existing designs.

              The FPGA is an array or island-style FPGA. It consists of an array of
              logic blocks and routing channels. Two I/O pads fit into the height of
              one row or the width of one column, as shown below. All the routing
              channels have the same width (number of wires).

              12.1 FPGA structure

              A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
              flip-flop, as shown below. In recent years, manufacturers have started
              moving to 6-input LUTs in their high-performance parts, claiming
              increased performance.

              Typical logic block

              There is only one output, which can be either the registered or the
              unregistered LUT output. The logic block has four inputs for the LUT and
              a clock input. Since clock signals (and often other high-fanout signals)
              are normally routed via special-purpose dedicated routing networks in
              commercial FPGAs, they are managed separately from other signals.
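The logic block described above can be modelled in a few lines of Python. This is only an illustrative sketch: the class name LogicBlock, the 16-bit truth_table encoding, and the registered flag are assumptions made for the example and do not come from any vendor's architecture.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        # 16 configuration bits, one per combination of the four inputs.
        self.truth_table = truth_table & 0xFFFF
        self.registered = registered
        self.ff = 0  # flip-flop state, cleared at start-up

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        # The four inputs form an index into the 16-entry table.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock_edge(self, a: int, b: int, c: int, d: int) -> None:
        # On a rising clock edge the flip-flop captures the LUT output.
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        # Single output: either the registered or the unregistered LUT value.
        return self.ff if self.registered else self.lut(a, b, c, d)


# Example configurations (hypothetical): 4-input AND and 4-input XOR (parity).
and4 = LogicBlock(0x8000)
xor4 = LogicBlock(0x6996)
```

Changing only the 16 configuration bits turns the same block into a different function, which is the sense in which the logic is programmable.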

              For this example architecture, the locations of the FPGA logic block pins
              are shown below.

              Logic Block Pin Locations

              Each input is accessible from one side of the logic block, while the
              output pin can connect to routing wires in both the channel to the right
              of and the channel below the logic block.

              Each logic block output pin can connect to any of the wiring segments in
              the channels adjacent to it.

              Similarly, an I/O pad can connect to any one of the wiring segments in
              the channel adjacent to it. For example, an I/O pad at the top of the
              chip can connect to any of the W wires (where W is the channel width) in
              the horizontal channel immediately below it.

              Generally, the FPGA routing is unsegmented. That is, each wiring segment
              spans only one logic block before it terminates in a switch box. By
              turning on some of the programmable switches within a switch box, longer
              paths can be constructed. For higher-speed interconnect, some FPGA
              architectures use longer routing lines that span multiple logic blocks.

              Whenever a vertical and a horizontal channel intersect, there is a switch
              box. In this architecture, when a wire enters a switch box, there are
              three programmable switches that allow it to connect to three other
              wires in adjacent channel segments. The pattern, or topology, of switches
              used in this architecture is the planar or domain-based switch box
              topology. In this switch box topology, a wire in track number 1 connects
              only to wires in track number 1 in adjacent channel segments, wires in
              track number 2 connect only to other wires in track number 2, and so on.
              The figure below illustrates the connections in a switch box.

              Switch box topology
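The planar (domain-based) rule described above can be written down as a small Python sketch. The side names and the function name planar_switch_targets are hypothetical and chosen only for illustration; the point is that the track number never changes as a wire passes through the switch box.

```python
SIDES = ("left", "right", "top", "bottom")


def planar_switch_targets(side: str, track: int, width: int):
    """Return the (side, track) wires reachable from one incoming wire.

    In a planar switch box a wire on track t can only be switched onto
    track t of the three other sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < width:
        raise ValueError("track must be in range 0..width-1")
    return [(other, track) for other in SIDES if other != side]


# A wire entering from the left on track 1 of a 4-track channel:
targets = planar_switch_targets("left", 1, 4)
```

Each incoming wire therefore has exactly three programmable switches, one per remaining side.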

              Modern FPGA families expand upon the above capabilities to include
              higher-level functionality fixed into the silicon. Having these common
              functions embedded into the silicon reduces the area required and gives
              those functions increased speed compared to building them from
              primitives. Examples of these include multipliers, generic DSP blocks,
              embedded processors, high-speed I/O logic, and embedded memories.

              FPGAs are also widely used for systems validation, including pre-silicon
              validation, post-silicon validation, and firmware development. This
              allows chip companies to validate their design before the chip is
              produced in the factory, reducing the time to market.

              12.4 FPGA Design and Programming

              To define the behavior of the FPGA, the user provides a hardware
              description language (HDL) or a schematic design. The HDL form might be
              easier to work with when handling large structures, because it is
              possible to just specify them numerically rather than having to draw
              every piece by hand. On the other hand, schematic entry can allow for
              easier visualisation of a design.

              Then, using an electronic design automation tool, a technology-mapped
              netlist is generated. The netlist can then be fitted to the actual FPGA
              architecture using a process called place-and-route, usually performed by
              the FPGA company's proprietary place-and-route software. The user will
              validate the map, place, and route results via timing analysis,
              simulation, and other verification methodologies. Once the design and
              validation process is complete, the binary file generated (also using the
              FPGA company's proprietary software) is used to (re)configure the FPGA.

              Going from schematic/HDL source files to the actual configuration: the
              source files are fed to a software suite from the FPGA/CPLD vendor that,
              through different steps, produces a file. This file is then transferred
              to the FPGA/CPLD via a serial interface (JTAG) or to an external memory
              device such as an EEPROM.

              The most common HDLs are VHDL and Verilog, although in an attempt to
              reduce the complexity of designing in HDLs, which have been compared to
              the equivalent of assembly languages, there are moves to raise the
              abstraction level through the introduction of alternative languages.

              To simplify the design of complex systems in FPGAs, there exist libraries
              of predefined complex functions and circuits that have been tested and
              optimized to speed up the design process. These predefined circuits are
              commonly called IP cores and are available from FPGA vendors and
              third-party IP suppliers (rarely free, and typically released under
              proprietary licenses). Other predefined circuits are available from
              developer communities such as OpenCores (typically free, and released
              under the GPL, BSD, or a similar license) and other sources.

              In a typical design flow, an FPGA application developer will simulate the
              design at multiple stages throughout the design process. Initially, the
              RTL description in VHDL or Verilog is simulated by creating test benches
              to simulate the system and observe results. Then, after the synthesis
              engine has mapped the design to a netlist, the netlist is translated to a
              gate-level description, where simulation is repeated to confirm that the
              synthesis proceeded without errors. Finally, the design is laid out in
              the FPGA, at which point propagation delays can be added and the
              simulation run again with these values back-annotated onto the netlist.

              Basic Process Technology Types

              SRAM - Based on static memory technology. In-system programmable and
              re-programmable. Requires external boot devices. CMOS.

              Antifuse - One-time programmable. CMOS.

              EPROM - Erasable Programmable Read-Only Memory technology. Usually
              one-time programmable in production because of plastic packaging.
              Windowed devices can be erased with ultraviolet (UV) light. CMOS.

              EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
              Can be erased, even in plastic packages. Some, but not all, EEPROM
              devices can be in-system programmed. CMOS.

              Flash - Flash-erase EPROM technology. Can be erased, even in plastic
              packages. Some, but not all, flash devices can be in-system programmed.
              Usually, a flash cell is smaller than an equivalent EEPROM cell and is
              therefore less expensive to manufacture. CMOS.

              Fuse - One-time programmable. Bipolar.

              Major Manufacturers

              Xilinx and Altera are the current FPGA market leaders and long-time
              industry rivals. Together, they control over 80 percent of the market,
              with Xilinx alone representing over 50 percent.

              Xilinx also provides free Windows and Linux design software, while Altera
              provides free Windows tools; its Solaris and Linux tools are only
              available via a rental scheme.

              Other competitors include Lattice Semiconductor (flash, SRAM), Actel
              (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
              power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic
              (handheld-focused CSSP, no general-purpose FPGAs).

              12.5 FPGA prototype

              FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
              prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
              for hardware verification and early software development.

              Hardware verification combined with early software and firmware
              co-design has become mainstream, and prototyping SoC and ASIC designs on
              an FPGA has become a good method of doing this.

              Reasons why prototyping is important:

              1. Running an SoC design on an FPGA prototype is a reliable way to ensure
              that it is functionally correct, compared with designers relying only on
              software simulations to verify that their hardware design is sound.
              Simulation speed and modeling accuracy limitations hinder such
              verification.

              2. Due to time constraints, many projects cannot wait until the silicon is
              back from the foundry to start on software tests. FPGA prototyping allows
              much more time for software development and testing at the
              software-hardware integration stage. This allows many unforeseen software
              bugs, which appear because of today's array of operating systems,
              applications, and hardware, to be found.

              3. Prototyping also allows the developer to ensure that all IP
              technologies on the system work well together, off the simulation stage
              and in actual form.

              4. Prototyping has the added advantage of serving as a demo platform for
              SoC clients, bringing in interest early. This speeds up the overall
              development cycle and allows for more enhancement or improvement of the
              chip features than would otherwise have been possible.

              13 SIMULATION RESULTS

              Fig: VGA CONTROLLER (gate level)

              Fig: SYNC UNIT (gate level)

              Fig: TECHNOLOGY SCHEMATIC

              14 APPLICATIONS

              Motion Tracking Systems

              Image processing systems

              Displaying Systems

              14.1 CONCLUSION

              Hence, by implementing the VGA Controller on an FPGA, a
              processor-generated (pixelated) picture is taken from memory space and
              the Controller provides digital RGB values for each pixel, as well as
              horizontal and vertical synchronization signals, in order to correctly
              display the picture on a connected monitor.

                5 Hardware Description

                5.1 Block Diagram

                5.1 VGA Synchronization Unit

                The Synchronization Unit provides the horizontal and vertical
                synchronization signals, HSYNC and VSYNC, that are required to
                correctly display a picture frame within the confines of a monitor's
                display area.

                These synchronization signals are used as control inputs by the
                monitor's horizontal and vertical deflection circuits. These circuits
                deflect the electrons emitted by the three primary color electron guns
                (Red, Green, Blue) left to right and from top to bottom, respectively.
                HSYNC provides the start and stop times for the horizontal deflection
                circuit, so that a line of pixels is correctly drawn across the screen
                display. VSYNC provides the start and stop times for the vertical
                deflection circuit, so that the lines of a frame are correctly drawn
                from the top to the bottom of the screen display.

                The resolution for the display is defined by the level on the
                RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If
                Low, the 800x600 resolution (SVGA) is used.

                Although the resolution determines the area of a monitor's screen
                within which an image can be displayed, the full extents of the chosen
                resolution do not have to be used. The actual extents of the image
                display area on the screen can be controlled by the use of the
                DISPSIZE_H and DISPSIZE_V inputs. These inputs determine the total
                number of pixels to be used in a line and the total number of lines in
                a frame, respectively.

                5.2 Blank pixel generation

                The total number of viewable pixels in a line and viewable lines in a
                frame is determined by the display resolution chosen through the
                RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
                the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
                chosen display resolution is used or not, the areas of the monitor
                screen to the top, bottom, left, and right of the viewable frame area
                are blanked by putting black pixels at the required line-pixel
                positions. This has the effect of centering the image on the screen.

                The color generated for a pixel in the Pixel Unit depends on whether or
                not the particular pixel needs to be blanked. The Synchronization Unit
                provides a signal to the Pixel Unit for this purpose: the line display
                enable signal, en. The signal is checked on each rising edge of the
                external clock signal (CLK) and is set as follows:

                If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then
                    en = 0 (pixel requires to be blanked; set color to be black)
                Else
                    en = 1 (pixel is a viewable pixel; generate RGB color accordingly)
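The enable rule above can be expressed directly as a small Python function. This is a minimal behavioural sketch, not the Controller's actual HDL; the function names display_enable and pixel_color are assumptions introduced for the example, while HCOUNT, VCOUNT, PixTotal, and LinTotal follow the text.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return the line display enable signal en for one pixel position."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the viewable area: pixel must be blanked
    return 1      # viewable pixel: RGB color is generated normally


def pixel_color(en: int, rgb: tuple) -> tuple:
    """Force the color to black whenever en is 0."""
    return rgb if en else (0, 0, 0)


# Example with a full 640x480 viewable area (PixTotal = 640, LinTotal = 480):
visible = display_enable(100, 200, 640, 480)
blanked = display_enable(640, 200, 640, 480)
```

In hardware the same comparison would be evaluated on every rising edge of CLK.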

                6 SIGNALS IN VGA CONTROLLER

                For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the
                horizontal timings are:

                Parameter                   Value      Unit
                Clock frequency             25.175     MHz
                Horizontal pixels           640
                Horizontal sync polarity    Negative
                Total time for each line    31.77      µs
                Front porch (A)             0.94       µs
                Sync pulse length (B)       3.77       µs
                Back porch (C)              1.89       µs
                Active video (D)            25.17      µs

                (Total horizontal sync time: 6.60 µs)

                6.1 The vertical timings are

                Parameter                   Value      Unit
                Vertical lines              480
                Vertical sync polarity      Negative
                Vertical frequency          59.94      Hz
                Front porch (E)             0.35       ms
                Sync pulse length (F)       0.06       ms
                Back porch (G)              1.02       ms
                Active video (H)            15.25      ms

                (Total vertical sync time: 1.43 ms)

                640 x 400 at 70 Hz is the video mode used for booting most x86 personal
                computers.

                640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16
                colors.

                It should be noted that the actual timings vary slightly. For example,
                for 640x480 at 60 fps, a 25.17 µs active video time with a pixel
                frequency of 25.174 MHz gives 633 pixels rather than the expected 640
                pixels.
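The arithmetic behind that remark is easy to check. The sketch below takes the active video time as 25.17 µs and the pixel frequency as 25.174 MHz, and also converts the 31.77 µs line time at 25.175 MHz into pixel clocks; the helper name clocks_in is hypothetical.

```python
def clocks_in(interval_us: float, pixel_clock_mhz: float) -> float:
    """Number of pixel clocks that fit in an interval (µs x MHz = clocks)."""
    return interval_us * pixel_clock_mhz


active_clocks = clocks_in(25.17, 25.174)   # active video portion of a line
line_clocks = clocks_in(31.77, 25.175)     # one complete line

print(int(active_clocks))   # 633
print(round(line_clocks))   # 800
```

The rounded figures in the timing table therefore account for the small shortfall from the nominal 640 pixels.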

                6.2 Generation of the horizontal synchronization signal - HSYNC

                The HSYNC signal is High (inactive) after an external reset signal
                (RST) is received by the VGA Controller. The signal is updated on each
                rising edge of the external clock signal (CLK).

                The state of the HSYNC signal depends on the value stored in the
                horizontal counter. It is driven low when

                    HCOUNT ≥ (PixTotal + BlankingLeft)

                and remains low while

                    HCOUNT < (PixTotal + BlankingLeft + TLSync)

                Vertical (Frame) Period

                6.3 Generation of the vertical synchronization signal - VSYNC

                The VSYNC signal is High (inactive) after an external reset signal
                (RST) is received by the VGA Controller. The signal is updated after
                every line of pixels is completed (i.e. on each rising edge of the
                HSYNC signal).

                The state of the VSYNC signal depends on the value stored in the
                vertical counter. It is driven low when

                    VCOUNT ≥ (LinTotal + BlankingLeft)

                and remains low while

                    VCOUNT < (LinTotal + BlankingLeft + TFSync)
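Both sync signals follow the same window comparison, so a single Python helper can illustrate them. This is a behavioural sketch only: the function name sync_level is an assumption, and the numeric values used below (16 blanking clocks, 96 sync clocks) are hypothetical example figures, not values taken from the Controller.

```python
def sync_level(count: int, visible: int, blanking: int, sync_len: int) -> int:
    """Return 0 (active low) inside the sync window, otherwise 1 (inactive)."""
    start = visible + blanking          # e.g. PixTotal + BlankingLeft
    end = start + sync_len              # e.g. ... + TLSync (or TFSync)
    return 0 if start <= count < end else 1


# Hypothetical horizontal example: PixTotal = 640, BlankingLeft = 16, TLSync = 96
PIX_TOTAL, BLANKING_LEFT, TL_SYNC = 640, 16, 96
hsync_trace = [sync_level(h, PIX_TOTAL, BLANKING_LEFT, TL_SYNC) for h in range(800)]
```

For VSYNC the same function would be called with VCOUNT, LinTotal, and TFSync, evaluated once per completed line rather than once per clock.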

                Address Counter (MEMADDR)

                The address counter is used to store the position of the next
                consecutive pixel in the frame. Its value is passed to the Pixel Unit
                on the internal bus signal hvcnt, which is then used to provide the
                ADDR_PIXEL signal to obtain the next pixel from picture memory.

                The counter is reset to zero when the VGA Controller receives an
                external reset signal (RST). The size of the counter depends on the
                values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply

                    0 to (PixTotal x LinTotal) - 1

                Taking the maximum number of pixels in a line and lines in a frame for
                each of the supported display resolutions, the maximum ranges for the
                counter are:

                • 640x480: 0 to 307199
                • 800x600: 0 to 479999

                The counter has 19-bit resolution.
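The 19-bit figure can be confirmed with a short calculation. The function name address_range is hypothetical; it simply evaluates (PixTotal x LinTotal) - 1 and the number of bits needed to hold that value.

```python
def address_range(pix_total: int, lin_total: int):
    """Return (maximum address, bits needed) for a given viewable frame size."""
    max_address = pix_total * lin_total - 1
    return max_address, max_address.bit_length()


vga = address_range(640, 480)    # maximum address and width for 640x480
svga = address_range(800, 600)   # maximum address and width for 800x600
```

Both resolutions fit in 19 bits, since 2^18 = 262144 is too small for either frame and 2^19 = 524288 is large enough for both.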

                While the value in the horizontal counter (HCOUNT) is less than the
                total number of viewable pixels in a line (PixTotal, the integer value
                of DISPSIZE_H), the counter is incremented on the rising edge of the
                external clock signal (CLK). Pixel addressing within the frame is
                consecutive. When the counter reaches the last pixel in a line, its
                incremented value is the first pixel in the next line down.

                The address counter will continue to be incremented until the value in
                the vertical counter (VCOUNT) is greater than or equal to the total
                number of viewable lines in a frame (LinTotal, the integer value of
                DISPSIZE_V). At this point, it will be rolled over to zero.
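The counting behaviour described above can be simulated with a short Python generator. It is a sketch under stated assumptions: h_period and v_period are hypothetical names for the total counts per line and per frame including blanking, and the tiny frame size in the example is chosen only so the result is easy to inspect.

```python
def frame_addresses(pix_total: int, lin_total: int, h_period: int, v_period: int):
    """Yield the picture-memory address for each viewable pixel of one frame."""
    address = 0
    for vcount in range(v_period):
        if vcount >= lin_total:
            address = 0              # past the last viewable line: roll over
            continue
        for hcount in range(h_period):
            if hcount < pix_total:   # only viewable pixels advance the counter
                yield address
                address += 1


# Tiny example: 4x3 viewable pixels inside a 6x5 scan (values are illustrative).
addresses = list(frame_addresses(4, 3, 6, 5))
```

The addresses run consecutively from one line into the next, so the last pixel of a line is followed directly by the first pixel of the line below.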

                7 VGA Controller

                The VGA Controller provides a simple 8-bit interface between a host
                microcontroller and any VGA-compatible monitor. This variant of the
                Controller provides six modes of display, depending on the resolution
                chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either
                Black & White, 16 Colors, or 64 Colors).

                7.1 VGA Controller Tasks

                1) To display the frames at a minimum refresh rate of 60 Hz:

                   1 sec = 60 frames
                   1 frame = max 1/60 sec
                   1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns), counting only
                   the 640 x 480 = 307200 visible pixels
                   Max time for 1 pixel = 40 ns, because the blanking intervals of each
                   line and frame also take up pixel clocks

                2) To send information for each and every pixel of a frame by means of
                the RGB signals.

                3) After 1 row of information has been completely displayed, the column
                must be reset and the row incremented; this is done by the HSYNC
                signal.

                4) After 1 frame of information has been completed, a new frame has to
                be started; for this, both row and column are reset, which is done by
                the VSYNC signal.

                => One clock is used to give the information for 1 pixel (and to
                control all operations):

                   1 clock = 1 pixel
                   1 clock cycle period = 40 ns
                   Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                So if a 25 MHz clock is used and 1 pixel of information is displayed on
                each and every clock, a refresh rate of about 60 Hz can be attained. Do
                not use a clock frequency lower than 25 MHz.

                Ex:

                Clock     Time period   No. of frames/sec
                15 MHz    More          Less (decreases; unable to display 60 frames/sec)
                25 MHz    40 ns         About 60 (effectively utilized frequency)
                30 MHz    Less          More (increases)

Signals used for image display:

R (Red)
G (Green)
B (Blue)
HSYNC (horizontal synchronization signal)
VSYNC (vertical synchronization signal)

All of the above signals are used to control the image data so that it is displayed correctly on the VGA monitor.

7.2 Timings for front porch, sync pulse, and back porch

            | Horizontal sync | Vertical sync
Front porch | 0.94 µs         | 0.35 ms
Sync pulse  | 3.77 µs         | 0.063 ms
Back porch  | 1.89 µs         | 1.02 ms
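The horizontal figures above can be turned into pixel-clock counts. The following is a minimal simulation-only sketch, assuming the 40 ns (25 MHz) pixel clock derived in 7.1 and 640 visible pixels per line. The entity name vga_timing_sketch, the helper to_clocks, and the round-to-nearest policy are illustrative assumptions, not part of the original design.

```vhdl
-- Hypothetical sketch: converts the horizontal timings quoted above
-- into counts of an assumed 40 ns pixel clock.
entity vga_timing_sketch is
end entity vga_timing_sketch;

architecture sim of vga_timing_sketch is
  constant CLK_PERIOD_NS : natural := 40;   -- assumed 25 MHz pixel clock
  constant H_VISIBLE     : natural := 640;  -- visible pixels per line

  -- Round a duration given in nanoseconds to the nearest clock count.
  function to_clocks(t_ns : natural) return natural is
  begin
    return (t_ns + CLK_PERIOD_NS / 2) / CLK_PERIOD_NS;
  end function to_clocks;

  constant H_FRONT : natural := to_clocks(940);   -- 0.94 us front porch
  constant H_SYNC  : natural := to_clocks(3770);  -- 3.77 us sync pulse
  constant H_BACK  : natural := to_clocks(1890);  -- 1.89 us back porch
  constant H_TOTAL : natural := H_VISIBLE + H_FRONT + H_SYNC + H_BACK;
begin
  process
  begin
    -- Print the derived counts once, then stop.
    report "front=" & integer'image(H_FRONT) &
           " sync="  & integer'image(H_SYNC) &
           " back="  & integer'image(H_BACK) &
           " total=" & integer'image(H_TOTAL);
    wait;
  end process;
end architecture sim;
```

Under these assumptions the counts come to 24, 94 and 47 clocks, or 805 clocks per line. A real design may round differently or use slightly different totals.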


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz
    800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz
RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).
CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.
DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.
DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.
DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows: Black & White mode: 8 1-bit pixels; 16 Colors mode: 2 4-bit pixels; 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.
ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.
VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.
R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.
B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture (in bytes) = (PixTotal x LinTotal) / number of pixels per byte
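As a worked instance of the formula above, the sketch below evaluates it for a 640x480 picture in each of the three color modes. It is a simulation-only illustration: the entity name picture_memory_sketch and the function picture_bytes are hypothetical, and rounding up to a whole byte is an assumption for sizes that do not divide evenly.

```vhdl
-- Hypothetical sketch: picture memory size for the three color modes.
entity picture_memory_sketch is
end entity picture_memory_sketch;

architecture sim of picture_memory_sketch is
  -- Bytes needed for one frame; rounds up to a whole byte (assumption).
  function picture_bytes(pix_total, lin_total, pixels_per_byte : positive)
    return natural is
  begin
    return (pix_total * lin_total + pixels_per_byte - 1) / pixels_per_byte;
  end function picture_bytes;
begin
  process
  begin
    -- Black & White: 8 pixels per byte
    assert picture_bytes(640, 480, 8) = 38400
      report "unexpected Black & White size" severity failure;
    -- 16 Colors: 2 pixels per byte
    assert picture_bytes(640, 480, 2) = 153600
      report "unexpected 16 Colors size" severity failure;
    -- 64 Colors: 1 pixel per byte
    assert picture_bytes(640, 480, 1) = 307200
      report "unexpected 64 Colors size" severity failure;
    report "640x480 picture sizes checked";
    wait;
  end process;
end architecture sim;
```

The largest case listed in this document, 800x600 in 64 Colors mode, needs 480000 bytes, which still fits within the 19-bit ADDR_PIXEL range of 524288 addresses.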

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right-shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right-shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
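The three addressing rules can be collected into one selected assignment. The sketch below is illustrative only: the entity name addr_pixel_sketch is hypothetical, and the CMOD encoding is taken from the pin table. To fill the full 19-bit ADDR_PIXEL width, the sketch pads the Black & White case with three leading zeros (a right shift by three); this is an assumption, since the text above writes two.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of the ADDR_PIXEL selection described above.
entity addr_pixel_sketch is
  port (
    cmod       : in  std_logic_vector(1 downto 0);   -- "00" B&W, "01" 16 Colors, "10" 64 Colors
    hvcnt      : in  std_logic_vector(18 downto 0);  -- pixel counter from the Synchronization Unit
    addr_pixel : out std_logic_vector(18 downto 0)   -- byte address in picture memory
  );
end entity addr_pixel_sketch;

architecture rtl of addr_pixel_sketch is
begin
  with cmod select
    addr_pixel <= "000" & hvcnt(18 downto 3) when "00",    -- 8 pixels per byte
                  "0"   & hvcnt(18 downto 1) when "01",    -- 2 pixels per byte
                  hvcnt                      when others;  -- 1 pixel per byte
end architecture rtl;
```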

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
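The enex condition translates directly into a conditional signal assignment. In this minimal sketch the entity name enex_sketch and the generic names are hypothetical, and the default Line Period and Frame Period values (800 clocks, 525 lines) are assumed figures for 640x480, not values taken from this document.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of the enex definition given above.
entity enex_sketch is
  generic (
    PIX_TOTAL    : natural := 640;  -- viewable pixels per line (DISPSIZE_H)
    LIN_TOTAL    : natural := 480;  -- viewable lines per frame (DISPSIZE_V)
    LINE_PERIOD  : natural := 800;  -- assumed total clocks per line
    FRAME_PERIOD : natural := 525   -- assumed total lines per frame
  );
  port (
    hcount : in  natural;    -- horizontal counter value
    vcount : in  natural;    -- vertical counter value
    enex   : out std_logic   -- '0' disables memory reads (RD = 0)
  );
end entity enex_sketch;

architecture rtl of enex_sketch is
begin
  -- Low while either counter is beyond the viewable area, High otherwise.
  enex <= '0' when ((hcount > PIX_TOTAL + 1) and (hcount < LINE_PERIOD)) or
                   ((vcount > LIN_TOTAL + 1) and (vcount < FRAME_PERIOD))
          else '1';
end architecture rtl;
```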

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors, or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
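The load-or-shift behavior of the three modes can be sketched as a single clocked process. This is only an illustration of the description above: the entity name pixreg_sketch and the internal signal reg are hypothetical, the CMOD encoding is taken from the pin table, and an asynchronous active-high reset is assumed because the text does not say how RST is sampled.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of the PIXREG load/shift rules described above.
entity pixreg_sketch is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                      -- assumed asynchronous, active High
    cmod   : in  std_logic_vector(1 downto 0);   -- "00" B&W, "01" 16 Colors, "10" 64 Colors
    hvcnt  : in  std_logic_vector(18 downto 0);
    data   : in  std_logic_vector(7 downto 0);   -- byte read from picture memory
    pixreg : out std_logic_vector(7 downto 0)
  );
end entity pixreg_sketch;

architecture rtl of pixreg_sketch is
  signal reg : std_logic_vector(7 downto 0);
begin
  pixreg <= reg;

  process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                            -- Black & White: 8 pixels per byte
          if hvcnt(2 downto 0) = "000" then
            reg <= data;                        -- load the next byte
          else
            reg <= '0' & reg(7 downto 1);       -- next pixel moves into bit 0
          end if;
        when "01" =>                            -- 16 Colors: 2 pixels per byte
          if hvcnt(0) = '0' then
            reg <= data;
          else
            reg <= "0000" & reg(7 downto 4);    -- second pixel moves into the low nibble
          end if;
        when others =>                          -- 64 Colors: load on every clock
          reg <= data;
      end case;
    end if;
  end process;
end architecture rtl;
```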

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).
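A minimal sketch of the reset, blanking and direct-use rules is given below. The entity name rgbreg_sketch is hypothetical and an asynchronous reset is assumed. The Black & White mapping shown (1 = white, 0 = black) is also an assumption, and the 16 Colors palette lookup is left out because its mapping is not reproduced in this text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch of the RGBREG update rules described above.
entity rgbreg_sketch is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                      -- assumed asynchronous, active High
    en     : in  std_logic;                      -- line display enable; '0' means blanking
    cmod   : in  std_logic_vector(1 downto 0);   -- "00" B&W, "01" 16 Colors, "10" 64 Colors
    pixreg : in  std_logic_vector(7 downto 0);   -- active pixel in the low-order bits
    rgbreg : out std_logic_vector(5 downto 0)
  );
end entity rgbreg_sketch;

architecture rtl of rgbreg_sketch is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= (others => '0');                 -- black on reset
    elsif rising_edge(clk) then
      if en = '0' then
        rgbreg <= (others => '0');               -- blanking: load black
      elsif cmod = "10" then
        rgbreg <= pixreg(5 downto 0);            -- 64 Colors: pixel value used directly
      elsif cmod = "00" then
        rgbreg <= (others => pixreg(0));         -- assumed B&W mapping: 1 = white, 0 = black
      else
        rgbreg <= (others => '0');               -- 16 Colors palette lookup omitted in this sketch
      end if;
    end if;
  end process;
end architecture rtl;
```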

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 4 x 4 x 4 = 64 different color permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device    | RAM Columns | RAM Blocks per Column | Total RAM Blocks | Total RAM Bits | Total RAM Kbits
XC3S100E  | 1           | 4                     | 4                | 73728          | 72
XC3S250E  | 2           | 6                     | 12               | 221184         | 216
XC3S500E  | 2           | 10                    | 20               | 368640         | 360
XC3S1200E | 2           | 14                    | 28               | 516096         | 504
XC3S1600E | 2           | 18                    | 36               | 663552         | 648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
• Funded by the US Department of Defense.
• The first publicly available version was released in 1985.
• In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
• In 1987 standardization => IEEE 1076-1987.
• An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
• Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.
• The IEEE 1164 standard was developed by the IEEE.
• IEEE 1164 contains definitions for a nine-valued data type, std_logic.
• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
• It defines, e.g., two numeric types: signed and unsigned.
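The types named above can be tried out in a few lines of simulation code. The sketch below is an illustration only, with hypothetical entity and signal names. It shows the unknown and high-impedance values of std_logic from IEEE 1164, and the signed and unsigned types from the numeric_std package of IEEE 1076.3.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation-only sketch of std_logic, signed and unsigned.
entity std_types_sketch is
end entity std_types_sketch;

architecture sim of std_types_sketch is
  signal undriven : std_logic;                              -- never assigned
  signal bus_line : std_logic;                              -- two drivers, resolved
  signal u        : unsigned(7 downto 0) := "11111111";
  signal s        : signed(7 downto 0)   := "11111111";
begin
  bus_line <= 'Z';   -- first driver is released (high impedance)
  bus_line <= '1';   -- second driver actively drives the line

  process
  begin
    wait for 1 ns;
    -- A std_logic signal that is never driven stays uninitialized.
    assert undriven = 'U' report "expected U" severity failure;
    -- The resolution function lets the driven '1' win over 'Z'.
    assert bus_line = '1' report "expected 1" severity failure;
    -- The same bit pattern reads differently as unsigned and signed.
    assert to_integer(u) = 255 report "expected 255" severity failure;
    assert to_integer(s) = -1  report "expected -1"  severity failure;
    report "std_logic and numeric_std checks passed";
    wait;
  end process;
end architecture sim;
```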

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with interface definition
• Defines the inputs/outputs of a component (define pins)
• A way to represent modularity in VHDL
• Similar to a symbol in a schematic
• The entity declaration describes the entity

E.g.

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports

• Provide channels of communication between the component and its environment
• Each port must have a name, a direction and a type
• An entity may have NO port declaration

Port directions

• In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
• Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
• In out: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
• Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

• Every entity has at least one architecture
• One entity can have several architectures
• Architectures can describe a design using: Behavior, Structure, Dataflow
• Architectures can describe a design on many levels: gate level, RTL (Register Transfer Level), behavioral level
• A configuration declaration links an architecture to an entity

E.g.

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;
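To see the entity and architecture work together, the Comparator can be placed in a small testbench. The Comparator entity and its Comparator1 architecture are repeated from the text so that the example is self-contained. The testbench Comparator_tb, its signal names, and the test values are hypothetical additions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench: an entity with no ports.
entity Comparator_tb is
end Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  -- Direct instantiation selects the architecture explicitly.
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  process
  begin
    a_s <= x"3C";
    b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1' report "EQ should be 1 for equal inputs" severity error;

    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0' report "EQ should be 0 for different inputs" severity error;

    report "Comparator test finished";
    wait;
  end process;
end sim;
```

Naming the architecture in the instantiation, as in work.Comparator(Comparator1), is one way to choose among several architectures of the same entity without writing a separate configuration declaration.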

Configurations

• Link the entity declaration and the architecture body together
• The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity
• Can be used to change simulation behavior without re-analyzing the VHDL source
• Complex configuration declarations are ignored in synthesis
• Some entities can have, e.g., a gate-level architecture and a behavioral architecture
• Are always optional

Packages

Packages contain information common to many design units.

1 Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2 Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
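The split between declaration and body can be seen in a small example. The package below is a hypothetical illustration (the names vga_pkg_sketch, frame_pixels and pkg_user_sketch are not from the original design). It declares a constant, a deferred constant, a subtype and a function, with the deferred value and the function body supplied in the package body.

```vhdl
-- Hypothetical package: the declaration part is visible to users.
package vga_pkg_sketch is
  constant H_VISIBLE : natural := 640;
  constant V_VISIBLE : natural;                       -- deferred constant
  subtype pixel_count_t is natural range 0 to 1023;
  function frame_pixels return natural;
end package vga_pkg_sketch;

-- The package body supplies the deferred value and the function body.
package body vga_pkg_sketch is
  constant V_VISIBLE : natural := 480;

  function frame_pixels return natural is
  begin
    return H_VISIBLE * V_VISIBLE;
  end function frame_pixels;
end package body vga_pkg_sketch;

use work.vga_pkg_sketch.all;

-- Hypothetical design unit that shares the package contents.
entity pkg_user_sketch is
end entity pkg_user_sketch;

architecture sim of pkg_user_sketch is
  signal column : pixel_count_t := 0;
begin
  process
  begin
    assert frame_pixels = 307200
      report "unexpected frame size" severity failure;
    report "package contents visible through the use clause";
    wait;
  end process;
end architecture sim;
```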

Libraries

A collection of VHDL design units (a database):

1 Packages
    package declaration
    package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

A library is usually a directory in a UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways:

Structural -------
Dataflow   ------- Higher level of abstraction
Behavioral -------

Structural VHDL description

• The circuit is described in terms of its components
• From a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram)
• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

• The circuit is described in terms of how data moves through the system
• In the dataflow style, you describe how information flows between registers in the system
• The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
• The behavior of the system over time is defined by registers
• There are no built-in registers in the VHDL language
  – either a lower-level description
  – or a behavioral description of sequential elements is needed
• The lower-level register descriptions must be created or obtained
• If there are no third-party models for registers => you must write the behavioral description of the registers
• The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

• The circuit is described in terms of its operation over time
• The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
• The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
• If no actual delay is used, the order of sequential operations is defined
• In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
• The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
• There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

• The basic simulation concept in VHDL
• A VHDL description can always be broken up into interconnected processes
• Quite similar to a Unix process

Process keyword in VHDL

• A process statement is a concurrent statement
• Statements inside process statements are sequential statements
• A process must contain either a sensitivity list or wait statement(s), but NOT both
• The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
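The rule that a process has either a sensitivity list or wait statements, but not both, can be illustrated with two processes that behave the same way. The entity and label names in this sketch are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sketch: two equivalent ways to wake a process up.
entity process_forms_sketch is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity process_forms_sketch;

architecture rtl of process_forms_sketch is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait at the end of the body.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```

Both processes run once at the start of simulation and then again whenever a or b changes, so y1 and y2 always carry the same value.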


                11 VGA CONTROLLER CODE

                library IEEE

                use IEEESTD_LOGIC_1164ALL

                use IEEESTD_LOGIC_ARITHALL

                use IEEESTD_LOGIC_UNSIGNEDALL

                use IEEEnumeric_stdALL

                ---- Uncomment the following library declaration if instantiating

                ---- any Xilinx primitives in this code

                --library UNISIM

                --use UNISIMVComponentsall

                entity vga_controller is port(clk50min std_logic

                rstnin std_logic

                hsyncout std_logic

                vsyncout std_logic

                redout std_logic_vector(0 downto 0)

                greenout std_logic_vector(0 downto 0)

                blueout std_logic_vector(0 downto 0))

                end vga_controller

                architecture Behavioral of vga_controller is

                component memory

                port (

                addr IN std_logic_VECTOR(17 downto 0)

                clk IN std_logic

                dout OUT std_logic_VECTOR(0 downto 0)

                en IN std_logic

                sinit IN std_logic)

                END component

                42

                component sync_unit port(

                clkin std_logic

                rstnin std_logic

                hsyncout std_logic

                ---horizontal synch pulse(row)

                vsyncout std_logic

                --vertical sync pulse(frame) 111110100000000000

                v_en_regout std_logic

                h_enout std_logic

                v_enout std_logic

                line_countinout integer

                c_hsinout integer)

                end component

                component pixelgen port(

                clkin std_logic

                rstnin std_logic

                hsyncin std_logic

                vsyncin std_logic

                h_enin std_logic

                v_enin std_logic

                v_en_reg in std_logic

                data_inin std_logic_vector(0 downto 0)

                line_countin integer

                c_hsin integer

                red out std_logic_vector(0 downto 0)

                greenout std_logic_vector(0 downto 0)

                blueout std_logic_vector(0 downto 0)

                addrout std_logic_vector(17 downto 0)

                rdout std_logic)

                end component

                43

                signal clkhsync_svsync_sh_en_sv_en_sv_en_reg_srd_sstd_logic

                signal line_count_sc_hs_saddrinteger

                signal data_sstd_logic_vector(0 downto 0)

                signal addr_sstd_logic_vector(17 downto 0)

                begin

                --addr_s lt= std_logic_vector(to_unsigned(addr18))

                process(clk50mrstn)

                begin

                if(rstn = 0)then

                clk lt= 0

                elsif(clk50m = 1 and clk50mevent)then

                clk lt= not clk

                end if

                --end if

                end process

                hsync lt= hsync_s

                vsync lt= vsync_s

                sync : sync_unit port map (
                    clk        => clk,
                    rstn       => rstn,
                    hsync      => hsync_s,
                    vsync      => vsync_s,
                    v_en_reg   => v_en_reg_s,
                    line_count => line_count_s,
                    h_en       => h_en_s,
                    v_en       => v_en_s,
                    c_hs       => c_hs_s
                );

                pixel : pixelgen port map (
                    clk        => clk,
                    rstn       => rstn,
                    hsync      => hsync_s,
                    vsync      => vsync_s,
                    h_en       => h_en_s,
                    v_en       => v_en_s,
                    v_en_reg   => v_en_reg_s,
                    data_in    => data_s,
                    line_count => line_count_s,
                    c_hs       => c_hs_s,
                    red        => red,
                    green      => green,
                    blue       => blue,
                    addr       => addr_s,
                    rd         => rd_s
                );

                mem12 : memory port map (
                    addr  => addr_s,
                    clk   => clk,
                    dout  => data_s,
                    en    => rd_s,
                    sinit => rstn
                );

                end Behavioral;


                SYNCHRONIZATION BLOCK CODE

                library IEEE;
                use IEEE.STD_LOGIC_1164.ALL;
                use IEEE.STD_LOGIC_ARITH.ALL;
                use IEEE.STD_LOGIC_UNSIGNED.ALL;

                entity sync_unit is port (
                    clk        : in  std_logic;
                    rstn       : in  std_logic;
                    hsync      : out std_logic;  -- horizontal sync pulse (row)
                    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
                    v_en_reg   : out std_logic;
                    h_en       : out std_logic;
                    v_en       : out std_logic;
                    line_count : out integer;
                    c_hs       : out integer);
                end sync_unit;

                architecture arch_sync_unit of sync_unit is
                    signal h_en_s, v_en_s : std_logic;
                    signal line_count_s, c_hs_s : integer;
                begin

                c_hs <= c_hs_s;
                line_count <= line_count_s;
                h_en <= h_en_s;
                v_en <= v_en_s;

                -- hsync counter
                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        c_hs_s <= 0;
                    else
                        if (clk = '1' and clk'event) then
                            if (c_hs_s = 793) then
                                c_hs_s <= 0;
                            else
                                c_hs_s <= c_hs_s + 1;
                            end if;
                        end if;
                    end if;
                end process;

                --vsync counter
                --vertical line counter
                process (h_en_s, rstn)
                begin
                    if (rstn = '0') then
                        line_count_s <= 0;
                    elsif (h_en_s = '1' and h_en_s'event) then
                        line_count_s <= line_count_s + 1;
                        if (line_count_s = 515) then
                            line_count_s <= 0;
                        end if;
                    end if;
                    --end if;
                end process;

                --hsync pulse
                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        hsync <= '1';
                    elsif (clk = '1' and clk'event) then
                        if (c_hs_s <= 95) then
                            hsync <= '0';
                        else
                            hsync <= '1';
                        end if;
                    end if;
                    --end if;
                end process;

                -- Horizontal Enable (Row Information Will be Sent in this period)
                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        h_en_s <= '0';
                    elsif (clk = '1' and clk'event) then
                        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                            h_en_s <= '1';
                        else
                            h_en_s <= '0';
                        end if;
                    end if;
                    --end if;
                end process;

                ---vsync pulse
                -- Vertical Enable (Data is Valid in this Region)
                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        v_en_s <= '0';
                    elsif (clk = '1' and clk'event) then
                        if ((line_count_s >= 35 and line_count_s <= 515)) then
                            v_en_s <= '1';
                        else
                            v_en_s <= '0';
                        end if;
                    end if;
                    --end if;
                end process;

                -- Vertical Sync Front porch Back Porch
                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        vsync <= '1';
                    elsif (clk = '1' and clk'event) then
                        if (line_count_s < 2) then
                            vsync <= '0';
                        else
                            vsync <= '1';
                        end if;
                    end if;
                    --end if;
                end process;

                process (clk)
                begin
                    if (clk = '1' and clk'event) then
                        v_en_reg <= v_en_s;
                    end if;
                end process;

                end arch_sync_unit;
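The counter limits used in the synchronization listing can be gathered into named constants, which makes the resulting line and frame rates easy to check. The sketch below is only an illustration: the package and constant names are hypothetical, and the 25 MHz pixel clock is an assumption based on the top level dividing an input named clk50m by two.

```vhdl
-- Hypothetical package; the numeric limits are read from the sync_unit listing.
package vga_timing_pkg is
    constant H_TOTAL       : natural := 794;  -- c_hs_s counts 0 .. 793
    constant H_SYNC_WIDTH  : natural := 96;   -- hsync low while c_hs_s <= 95
    constant H_ACTIVE      : natural := 636;  -- h_en high for c_hs_s in 143 .. 778
    constant V_TOTAL       : natural := 516;  -- line_count_s counts 0 .. 515
    constant V_SYNC_WIDTH  : natural := 2;    -- vsync low while line_count_s < 2
    constant V_ACTIVE      : natural := 481;  -- v_en high for line_count_s in 35 .. 515

    -- Assumed pixel clock: 50 MHz board clock divided by two.
    constant PIXEL_CLK_HZ  : natural := 25_000_000;

    -- Integer division, so both results are truncated.
    constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;   -- 31486
    constant FRAME_RATE_HZ : natural := LINE_RATE_HZ / V_TOTAL;   -- 61
end package vga_timing_pkg;
```

Under these assumptions the line rate is about 31.5 kHz and the frame rate about 61 Hz, close to the nominal 640 x 480 VGA timing of roughly 31.47 kHz and 60 Hz.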


                PIXEL GENERATOR

                library IEEE;
                use IEEE.STD_LOGIC_1164.ALL;
                use IEEE.STD_LOGIC_ARITH.ALL;
                use IEEE.STD_LOGIC_UNSIGNED.ALL;

                entity pixelgen is port (
                    clk        : in  std_logic;
                    rstn       : in  std_logic;
                    hsync      : in  std_logic;
                    vsync      : in  std_logic;
                    h_en       : in  std_logic;
                    v_en       : in  std_logic;
                    v_en_reg   : in  std_logic;
                    data_in    : in  std_logic_vector(0 downto 0);
                    line_count : in  integer;
                    c_hs       : in  integer;
                    red        : out std_logic_vector(0 downto 0);
                    green      : out std_logic_vector(0 downto 0);
                    blue       : out std_logic_vector(0 downto 0);
                    addr       : out std_logic_vector(17 downto 0);
                    rd         : out std_logic
                );
                end pixelgen;

                architecture behv of pixelgen is
                    signal addr_s : std_logic_vector(17 downto 0);
                begin

                addr <= addr_s;

                process (clk, rstn)
                begin
                    if (rstn = '0') then
                        rd     <= '0';
                        red    <= "0";
                        green  <= "0";
                        blue   <= "0";
                        addr_s <= x"0000" & "00";
                    elsif (clk = '1' and clk'event) then
                        -- rising edge of v_en: restart the read address for a new frame
                        if ((v_en = '1') and (v_en_reg = '0')) then
                            addr_s <= (others => '0');
                        end if;
                        if (v_en = '1' and h_en = '1') then
                            if (line_count <= 514) then
                                if (c_hs <= 655) then
                                    -- inside the image: read memory and drive all three colors
                                    rd     <= '1';
                                    red    <= data_in;
                                    green  <= data_in;
                                    blue   <= data_in;
                                    addr_s <= addr_s + "000000000000000001";
                                else
                                    rd    <= '0';
                                    red   <= "0";
                                    green <= "1";
                                    blue  <= "0";
                                end if;
                            else
                                rd    <= '0';
                                red   <= "1";
                                green <= "1";
                                blue  <= "1";
                            end if;
                            -- wrap the address after the last stored pixel
                            if (addr_s = "111011111111111111") then
                                addr_s <= (others => '0');
                            end if;
                        else
                            red   <= "0";
                            green <= "0";
                            blue  <= "0";
                        end if;
                    end if;
                    -- end if;
                    --end if;
                end process;

                end behv;


                12 FPGA ARCHITECTURE

                A field-programmable gate array (FPGA) is a semiconductor device that can
                be configured by the customer or designer after manufacturing, hence the
                name "field-programmable". FPGAs are programmed using a logic circuit
                diagram or source code in a hardware description language (HDL) to specify
                how the chip will work. They can be used to implement any logical function
                that an application-specific integrated circuit (ASIC) could perform, but the
                ability to update the functionality after shipping offers advantages for many
                applications.

                FPGAs contain programmable logic components called "logic blocks", and a
                hierarchy of reconfigurable interconnects that allow the blocks to be "wired
                together", somewhat like a one-chip programmable breadboard. Logic blocks
                can be configured to perform complex combinational functions, or merely
                simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
                include memory elements, which may be simple flip-flops or more complete
                blocks of memory.

                History

                The FPGA industry sprouted from programmable read-only memory (PROM)
                and programmable logic devices (PLDs). PROMs and PLDs both had the option
                of being programmed in batches in a factory or in the field (field
                programmable); however, programmable logic was hard-wired between logic
                gates.

                Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
                first commercially viable field-programmable gate array in 1985: the
                XC2064. The XC2064 had programmable gates and programmable
                interconnects between gates, the beginnings of a new technology and market.
                The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
                3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
                into the National Inventors Hall of Fame for his invention.

                Some of the industry's foundational concepts and technologies for
                programmable logic arrays, gates, and logic blocks are founded in patents
                awarded to David W. Page and LuVerne R. Peterson in 1985.

                In the late 1980s, the Naval Surface Warfare Department funded an experiment
                proposed by Steve Casselman to develop a computer that would implement
                600,000 reprogrammable gates. Casselman was successful, and the system was
                awarded a patent in 1992.

                Xilinx continued unchallenged and quickly growing from 1985 to the
                mid-1990s, when competitors sprouted up, eroding significant market share.
                By 1993, Actel was serving about 18 percent of the market.

                The 1990s were an explosive period for FPGAs, both in sophistication and in
                the volume of production. In the early 1990s, FPGAs were primarily used in
                telecommunications and networking. By the end of the decade, FPGAs had
                found their way into consumer, automotive, and industrial applications.

                FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
                algorithm technology and FPGAs to create a sound recognition device.
                Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
                to decide the configuration needed to accomplish a sound recognition task.

                12.1 Modern developments

                A recent trend has been to take the coarse-grained architectural
                approach a step further by combining the logic blocks and interconnects of
                traditional FPGAs with embedded microprocessors and related peripherals to
                form a complete "system on a programmable chip". This work mirrors the
                architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
                Group, which combined a reconfigurable CPU architecture on a single chip
                called the SB24. That work was done in 1982. Examples of such hybrid
                technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
                which include one or more PowerPC processors embedded within the FPGA's
                logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
                processor in combination with Atmel's programmable logic architecture.

                An alternate approach to using hard-macro processors is to make use of "soft"
                processor cores that are implemented within the FPGA logic.

                As previously mentioned, many modern FPGAs have the ability to be
                reprogrammed at run time, and this is leading to the idea of reconfigurable
                computing or reconfigurable systems: CPUs that reconfigure themselves to
                suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
                example of a reconfigurable soft processor implemented on FPGAs. However,
                it does not support dynamic reconfiguration at runtime, but instead adapts
                itself to a specific program.

                Additionally, new non-FPGA architectures are beginning to emerge.
                Software-configurable microprocessors such as the Stretch S5000 adopt a
                hybrid approach by providing an array of processor cores and FPGA-like
                programmable cores on the same chip.

                Gates

                1987: 9,000 gates, Xilinx
                1992: 600,000, Naval Surface Warfare Department
                Early 2000s: Millions

                Market size

                1985: First commercial FPGA technology invented by Xilinx
                1987: $14 million
                ~1993: >$385 million
                2005: $1.9 billion
                2010 estimates: $2.75 billion

                12.2 FPGA Comparisons

                Historically, FPGAs have been slower, less energy efficient, and generally
                achieved less functionality than their fixed ASIC counterparts. A combination
                of volume, fabrication improvements, research and development, and the I/O
                capabilities of new supercomputers has largely closed the performance gap
                between ASICs and FPGAs.

                Advantages include a shorter time to market, the ability to re-program in the
                field to fix bugs, and lower non-recurring engineering costs. Vendors can also
                take a middle road by developing their hardware on ordinary FPGAs, but
                manufacturing their final version so that it can no longer be modified after
                the design has been committed.

                Xilinx claims that several market and technology dynamics are changing the
                ASIC/FPGA paradigm:

                IC costs are rising aggressively
                ASIC complexity has bolstered development time and costs
                R&D resources and headcount are decreasing
                Revenue losses for slow time-to-market are increasing
                Financial constraints in a poor economy are driving low-cost technologies

                These trends make FPGAs a better alternative than ASICs for a growing
                number of higher-volume applications than they have historically been used
                for, to which the company attributes the growing number of FPGA design
                starts (see History).

                The primary differences between CPLDs and FPGAs are architectural. A
                CPLD has a somewhat restrictive structure consisting of one or more
                programmable sum-of-products logic arrays feeding a relatively small number
                of clocked registers. The result of this is less flexibility, with the
                advantage of more predictable timing delays and a higher logic-to-interconnect
                ratio. The FPGA architectures, on the other hand, are dominated by
                interconnect. This makes them far more flexible (in terms of the range of
                designs that are practical for implementation within them) but also far more
                complex to design for.

                Another notable difference between CPLDs and FPGAs is the presence in most
                FPGAs of higher-level embedded functions (such as adders and multipliers)
                and embedded memories, as well as logic blocks that implement decoders
                or mathematical functions.

                Some FPGAs have the capability of partial re-configuration, which lets one
                portion of the device be re-programmed while other portions continue running.

                12.3 Applications

                Applications of FPGAs include digital signal processing, software-defined
                radio, aerospace and defense systems, ASIC prototyping, medical imaging,
                computer vision, speech recognition, cryptography, bioinformatics, computer
                hardware emulation, radio astronomy, and a growing range of other areas.

                FPGAs originally began as competitors to CPLDs and competed in a similar
                space, that of glue logic for PCBs. As their size, capabilities, and speed
                increased, they began to take over larger and larger functions, to the state
                where some are now marketed as full systems on chips (SoC). Particularly with
                the introduction of dedicated multipliers into FPGA architectures in the late
                1990s, applications which had traditionally been the sole reserve of DSPs
                began to incorporate FPGAs instead.

                FPGAs especially find applications in any area or algorithm that can make use
                of the massive parallelism offered by their architecture. One such area is code
                breaking, in particular brute-force attack of cryptographic algorithms.

                FPGAs are increasingly used in conventional high performance computing
                applications, where computational kernels such as FFT or convolution are
                performed on the FPGA instead of a microprocessor.

                The inherent parallelism of the logic resources on an FPGA allows for
                considerable computational throughput even at low MHz clock rates. The
                flexibility of the FPGA allows for even higher performance by trading off
                precision and range in the number format for an increased number of parallel
                arithmetic units. This has driven a new type of processing called
                reconfigurable computing, where time-intensive tasks are offloaded from
                software to FPGAs.
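The trade-off between operand precision and the number of parallel arithmetic units can be illustrated with a small generic design. The following is a minimal sketch, not taken from the project: the entity name parallel_mult and its generics LANES and WIDTH are hypothetical. Narrowing WIDTH makes each multiplier cheaper, so more lanes fit in the same logic budget.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: LANES independent multipliers working in parallel.
entity parallel_mult is
    generic (
        LANES : positive := 4;  -- number of parallel multipliers
        WIDTH : positive := 8   -- operand width in bits
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);    -- packed operands
        b   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);    -- packed operands
        p   : out std_logic_vector(LANES*2*WIDTH - 1 downto 0)   -- packed products
    );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
    -- One registered multiplier per lane; all lanes update on the same clock edge.
    gen_lanes : for i in 0 to LANES - 1 generate
        process (clk)
        begin
            if rising_edge(clk) then
                p((i + 1)*2*WIDTH - 1 downto i*2*WIDTH) <=
                    std_logic_vector(unsigned(a((i + 1)*WIDTH - 1 downto i*WIDTH)) *
                                     unsigned(b((i + 1)*WIDTH - 1 downto i*WIDTH)));
            end if;
        end process;
    end generate gen_lanes;
end architecture rtl;
```

Every lane produces one product per clock cycle, so throughput scales with LANES rather than with clock frequency.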

                The adoption of FPGAs in high performance computing is currently limited by
                the complexity of FPGA design compared to conventional software and by the
                extremely long turn-around times of current design tools, where a wait of
                4 to 8 hours is necessary after even minor changes to the source code.

                Traditionally, FPGAs have been reserved for specific vertical applications
                where the volume of production is small. For these low-volume applications,
                the premium that companies pay in hardware costs per unit for a
                programmable chip is more affordable than the development resources spent
                on creating an ASIC. Today, new cost and performance dynamics have
                broadened the range of viable applications.

                Architecture

                The most common FPGA architecture consists of an array of configurable
                logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
                channels have the same width (number of wires). Multiple I/O pads may fit
                into the height of one row or the width of one column in the array.

                An application circuit must be mapped into an FPGA with adequate resources.
                While the number of CLBs and I/Os required is easily determined from the
                design, the number of routing tracks needed may vary considerably even
                among designs with the same amount of logic. (For example, a crossbar switch
                requires much more routing than a systolic array with the same gate count.)
                Since unused routing tracks increase the cost (and decrease the performance)
                of the part without providing any benefit, FPGA manufacturers try to provide
                just enough tracks so that most designs that will fit in terms of LUTs and I/Os
                can be routed. This is determined by estimates such as those derived from
                Rent's rule or by experiments with existing designs.

                The FPGA is an array or island-style FPGA. It consists of an array of logic
                blocks and routing channels. Two I/O pads fit into the height of one row or the
                width of one column, as shown below. All the routing channels have the same
                width (number of wires).

                Fig 12.1 FPGA structure

                A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                flip-flop, as shown below. In recent years, manufacturers have started moving
                to 6-input LUTs in their high performance parts, claiming increased
                performance.

                Typical logic block

                There is only one output, which can be either the registered or the
                unregistered LUT output. The logic block has four inputs for the LUT and a
                clock input. Since clock signals (and often other high-fanout signals) are
                normally routed via special-purpose dedicated routing networks in commercial
                FPGAs, they and other signals are separately managed.
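A logic block of this kind can be sketched in VHDL as follows. This is a behavioural illustration rather than any vendor's primitive; the entity name logic_block, the generics LUT_INIT and USE_FF, and the port names are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: a 4-input LUT and a flip-flop.
entity logic_block is
    generic (
        LUT_INIT : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
        USE_FF   : boolean := true                            -- true selects the registered output
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
        q   : out std_logic                      -- the single block output
    );
end entity logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16-entry truth table indexed by the input vector.
    lut_out <= LUT_INIT(to_integer(unsigned(a)));

    -- D flip-flop capturing the LUT output on the rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Output multiplexer: registered or unregistered LUT output.
    q <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.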

                For this example architecture, the locations of the FPGA logic block pins are
                shown below.

                Logic Block Pin Locations

                Each input is accessible from one side of the logic block, while the output pin
                can connect to routing wires in both the channel to the right and the channel
                below the logic block.

                Each logic block output pin can connect to any of the wiring segments in the
                channels adjacent to it.

                Similarly, an I/O pad can connect to any one of the wiring segments in the
                channel adjacent to it. For example, an I/O pad at the top of the chip can
                connect to any of the W wires (where W is the channel width) in the horizontal
                channel immediately below it.

                Generally, the FPGA routing is unsegmented. That is, each wiring segment
                spans only one logic block before it terminates in a switch box. By turning on
                some of the programmable switches within a switch box, longer paths can be
                constructed. For higher speed interconnect, some FPGA architectures use
                longer routing lines that span multiple logic blocks.

                Whenever a vertical and a horizontal channel intersect, there is a switch box.
                In this architecture, when a wire enters a switch box, there are three
                programmable switches that allow it to connect to three other wires in adjacent
                channel segments. The pattern, or topology, of switches used in this
                architecture is the planar or domain-based switch box topology. In this switch
                box topology, a wire in track number 1 connects only to wires in track
                number 1 in adjacent channel segments, wires in track number 2 connect
                only to other wires in track number 2, and so on. The figure below illustrates
                the connections in a switch box.

                Switch box topology

                Modern FPGA families expand upon the above capabilities to include higher
                level functionality fixed into the silicon. Having these common functions
                embedded into the silicon reduces the area required and gives those functions
                increased speed compared to building them from primitives. Examples of these
                include multipliers, generic DSP blocks, embedded processors, high speed I/O
                logic, and embedded memories.

                FPGAs are also widely used for systems validation, including pre-silicon
                validation, post-silicon validation, and firmware development. This allows chip
                companies to validate their design before the chip is produced in the factory,
                reducing the time to market.

                12.4 FPGA Design and Programming

                To define the behavior of the FPGA, the user provides a hardware description
                language (HDL) or a schematic design. The HDL form might be easier to work
                with when handling large structures, because it's possible to just specify them
                numerically rather than having to draw every piece by hand. On the other hand,
                schematic entry can allow for easier visualisation of a design.

                Then, using an electronic design automation tool, a technology-mapped netlist
                is generated. The netlist can then be fitted to the actual FPGA architecture
                using a process called place-and-route, usually performed by the FPGA
                company's proprietary place-and-route software. The user will validate the
                map, place, and route results via timing analysis, simulation, and other
                verification methodologies. Once the design and validation process is
                complete, the binary file generated (also using the FPGA company's
                proprietary software) is used to (re)configure the FPGA.

                Going from schematic/HDL source files to actual configuration: the source
                files are fed to a software suite from the FPGA/CPLD vendor that, through
                different steps, will produce a file. This file is then transferred to the
                FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                like an EEPROM.

                The most common HDLs are VHDL and Verilog, although in an attempt to
                reduce the complexity of designing in HDLs, which have been compared to the
                equivalent of assembly languages, there are moves to raise the abstraction level
                through the introduction of alternative languages.

                To simplify the design of complex systems in FPGAs, there exist libraries of
                predefined complex functions and circuits that have been tested and optimized
                to speed up the design process. These predefined circuits are commonly called
                IP cores, and are available from FPGA vendors and third-party IP suppliers
                (rarely free, and typically released under proprietary licenses). Other
                predefined circuits are available from developer communities such as
                OpenCores (typically free, and released under the GPL, BSD, or similar
                license), and other sources.

                In a typical design flow, an FPGA application developer will simulate the
                design at multiple stages throughout the design process. Initially, the RTL
                description in VHDL or Verilog is simulated by creating test benches to
                simulate the system and observe results. Then, after the synthesis engine has
                mapped the design to a netlist, the netlist is translated to a gate level
                description, where simulation is repeated to confirm the synthesis proceeded
                without errors. Finally, the design is laid out in the FPGA, at which point
                propagation delays can be added and the simulation run again with these
                values back-annotated onto the netlist.
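As an illustration of the first simulation stage, the sketch below shows a small self-checking test bench. It is a hypothetical example and not part of the project code: the device under test, and_reg, is a registered AND gate invented for the purpose, and the 20 ns clock period is an arbitrary choice.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical device under test: a registered AND gate.
entity and_reg is
    port (clk, a, b : in std_logic; q : out std_logic);
end entity and_reg;

architecture rtl of and_reg is
begin
    process (clk)
    begin
        if rising_edge(clk) then
            q <= a and b;
        end if;
    end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Self-checking test bench: drives stimulus and asserts on the response.
entity and_reg_tb is
end entity and_reg_tb;

architecture sim of and_reg_tb is
    signal clk  : std_logic := '0';
    signal a, b : std_logic := '0';
    signal q    : std_logic;
    signal done : boolean := false;
begin
    dut : entity work.and_reg port map (clk => clk, a => a, b => b, q => q);

    -- Free-running clock with a 20 ns period; it stops once the stimulus is done.
    clk <= not clk after 10 ns when not done else '0';

    stimulus : process
    begin
        a <= '1'; b <= '0';
        wait until rising_edge(clk);
        wait for 1 ns;  -- let the registered output settle
        assert q = '0' report "expected q = 0 for a = 1, b = 0" severity error;

        a <= '1'; b <= '1';
        wait until rising_edge(clk);
        wait for 1 ns;
        assert q = '1' report "expected q = 1 for a = 1, b = 1" severity error;

        done <= true;
        wait;  -- suspend forever; the simulation ends when no events remain
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the post-synthesis and post-layout netlists, which is what makes the repeated simulation stages practical.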

                Basic Process Technology Types

                SRAM - Based on static memory technology. In-system programmable and
                re-programmable. Requires external boot devices. CMOS.

                Antifuse - One-time programmable. CMOS.

                EPROM - Erasable Programmable Read-Only Memory technology. Usually
                one-time programmable in production because of plastic packaging.
                Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                EEPROM - Electrically Erasable Programmable Read-Only Memory
                technology. Can be erased, even in plastic packages. Some, but not all,
                EEPROM devices can be in-system programmed. CMOS.

                Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                packages. Some, but not all, flash devices can be in-system programmed.
                Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                therefore less expensive to manufacture. CMOS.

                Fuse - One-time programmable. Bipolar.

                Major Manufacturers

                Xilinx and Altera are the current FPGA market leaders and long-time industry
                rivals. Together, they control over 80 percent of the market, with Xilinx alone
                representing over 50 percent.

                Xilinx also provides free Windows and Linux design software, while Altera
                provides free Windows tools; the Solaris and Linux tools are only available via
                a rental scheme.

                Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
                focused CSSP, no general purpose FPGAs).

                12.5 FPGA prototype

                FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                for hardware verification and early software development.

                Verification of hardware designs together with early software and firmware
                co-design has become mainstream. Prototyping SoC and ASIC designs on an
                FPGA has become a good method to do this.

                Reasons why prototyping is important:

                1. Running a SoC design on an FPGA prototype is a reliable way to ensure
                that it is functionally correct. The alternative is for designers to rely
                only on software simulation to verify that their hardware design is sound,
                where simulation speed and modeling accuracy limitations hinder development.

                2. Due to time constraints, many projects cannot wait until the silicon is
                back from the foundry to start on software tests. FPGA prototyping allows
                much more time for software development and for testing at the
                software-hardware integration stage. This allows many unforeseen software
                bugs, which appear because of today's array of operating systems,
                applications and hardware, to be found early.

                3. Prototyping also allows the developer to ensure that all IP technologies
                in the system work well together, off the simulation stage and in actual form.

                4. Prototyping has the added advantage of serving as a demo platform for
                SoC clients, bringing in interest early. This speeds up the overall
                development cycle and allows more enhancement or improvement of the chip
                features than would otherwise have been possible.


                13 SIMULATION RESULTS

                Fig. Gate level: VGA Controller

                Fig. Gate level: Sync Unit

                Fig. Technology schematic

                14 APPLICATIONS

                Motion Tracking Systems

                Image processing systems

                Displaying Systems

                14.1 CONCLUSION

                Hence, by implementing the VGA Controller on an FPGA, a processor-generated
                picture (pixelated) is taken from memory space, and the Controller provides
                digital RGB values for each pixel, as well as horizontal and vertical
                synchronization signals, in order to correctly display the picture on a
                connected monitor.


                REFERENCES

                1. http://en.wikipedia.org/wiki/Video_Graphics_Array

                2. A VGA display controller, by Eduardo Sanchez

                3. Hinner, "VGA Interface and video signal documents"


                  The Synchronization Unit provides the horizontal and vertical
                  synchronization signals – HSYNC and VSYNC – that are required to correctly
                  display a picture frame within the confines of a monitor's display area.

                  These synchronization signals are used as control inputs by the
                  monitor's horizontal and vertical deflection circuits. These circuits deflect
                  the electrons emitted by the three primary color electron guns (Red, Green,
                  Blue) left to right and from top to bottom, respectively. HSYNC provides the
                  start and stop times for the horizontal deflection circuit, so that a line of
                  pixels is correctly drawn across the screen display. VSYNC provides the start
                  and stop times for the vertical deflection circuit, so that the lines of a
                  frame are correctly drawn from the top to the bottom of the screen display.

                  The resolution for the display is defined by the level on the
                  RESOLUTION input. If High, the 640x480 resolution (VGA) is used. If Low,
                  the 800x600 resolution (SVGA) is used.

                  Although the resolution determines the area of a monitor's screen
                  within which an image can be displayed, the full extents of the chosen
                  resolution do not have to be used. The actual extents of the image display
                  area on the screen can be controlled by the use of the DISPSIZE_H and
                  DISPSIZE_V inputs. These inputs determine the total number of pixels to be
                  used in a line and the total number of lines in a frame, respectively.

                  5.2 Blank pixel generation

                  The total number of viewable pixels in a line and viewable lines in a
                  frame is determined by the display resolution chosen through the
                  RESOLUTION input (1 = 640x480, 0 = 800x600) and the values received on
                  the DISPSIZE_H and DISPSIZE_V buses. Whether the full extent of the
                  chosen display resolution is used or not, the areas of the monitor screen to
                  the top, bottom, left and right of the viewable frame area are blanked by
                  putting black pixels at the required line-pixel positions. This has the
                  effect of centering the image on the screen.


                  The color generated for a pixel in the Pixel Unit depends on whether
                  the particular pixel needs to be blanked or not. The Synchronization Unit
                  provides a signal to the Pixel Unit for this purpose: the line display
                  enable signal, en. The signal is checked on each rising edge of the external
                  clock signal (CLK) and is set as follows:

                  If (HCOUNT ≥ PixTotal) or (VCOUNT ≥ LinTotal) then
                      en = 0 (pixel requires to be blanked – set color to be black)
                  Else
                      en = 1 (pixel is a viewable pixel – generate RGB color accordingly)
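                  The rule above can be sketched as a small behavioral model. This is
                  only an illustration in Python, not the Controller's HDL; the function
                  names display_enable and pixel_color are hypothetical, and the 640x480
                  extents in the example are assumed values for PixTotal and LinTotal.

```python
def display_enable(hcount: int, vcount: int, pix_total: int, lin_total: int) -> int:
    """Return en: 0 when the pixel must be blanked, 1 when it is viewable."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0
    return 1


def pixel_color(rgb: int, en: int) -> int:
    """Force the 6-bit RGB code to black (000000) whenever en is 0."""
    return rgb if en == 1 else 0b000000


if __name__ == "__main__":
    PIX_TOTAL, LIN_TOTAL = 640, 480  # assumed: full VGA extents
    for h, v in [(0, 0), (639, 479), (640, 0), (0, 480)]:
        print(h, v, display_enable(h, v, PIX_TOTAL, LIN_TOTAL))
```

                  Any position at or beyond the viewable extents in either direction
                  yields en = 0, which is what produces the black border around the image.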

                  6 SIGNALS IN VGA CONTROLLER

                  For the most common VGA mode, 640x480 at 60 Hz non-interlaced, the
                  horizontal timings are:

                  Parameter                  Value      Unit
                  Clock frequency            25.175     MHz
                  Horizontal pixels          640
                  Horizontal sync polarity   Negative
                  Total time for each line   31.77      µs
                  Front porch (A)            0.94       µs
                  Sync pulse length (B)      3.77       µs
                  Back porch (C)             1.89       µs
                  Active video (D)           25.17      µs

                  (Total horizontal sync time: 6.60 µs)


                  6.1 The vertical timings are

                  Parameter                  Value      Unit
                  Vertical lines             480
                  Vertical sync polarity     Negative
                  Vertical frequency         59.94      Hz
                  Front porch (E)            0.35       ms
                  Sync pulse length (F)      0.06       ms
                  Back porch (G)             1.02       ms
                  Active video (H)           15.25      ms

                  (Total vertical sync time: 1.43 ms)

                  640 x 400 at 70 Hz is the video mode used for booting most x86 personal
                  computers.

                  640 x 480 at 60 Hz is the default MS-Windows graphics mode with 16 colors.

                  It should be noted that the actual timings vary slightly. For example, for
                  640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
                  25.174 MHz gives 633 pixels rather than the expected 640 pixels.
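                  The arithmetic behind that remark can be checked directly. The sketch
                  below uses only the figures quoted in the timing tables (25.175 MHz,
                  31.77 µs, 25.17 µs, 59.94 Hz); the helper name clocks_in is hypothetical.

```python
def clocks_in(duration_us: float, clock_mhz: float) -> float:
    """Number of clock cycles in a duration (microseconds x MHz = cycles)."""
    return duration_us * clock_mhz


# Active video time at the 25.174 MHz pixel frequency quoted in the text.
active_pixels = clocks_in(25.17, 25.174)

# Whole line at the nominal 25.175 MHz clock.
clocks_per_line = clocks_in(31.77, 25.175)

# Lines per frame: frame time (1 / 59.94 Hz) divided by the line time.
frame_time_us = 1e6 / 59.94
lines_per_frame = frame_time_us / 31.77

if __name__ == "__main__":
    print(f"active pixels   ~ {active_pixels:.1f}")
    print(f"clocks per line ~ {clocks_per_line:.1f}")
    print(f"lines per frame ~ {lines_per_frame:.1f}")
```

                  The rounded table values therefore correspond to roughly 800 clocks per
                  line and 525 lines per frame, with the shortfall in active pixels coming
                  from rounding of the quoted times.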

                  6.2 Generation of the horizontal synchronization signal – HSYNC

                  The HSYNC signal is High (inactive) after an external reset signal
                  (RST) is received by the VGA Controller. The signal is updated on each
                  rising edge of the external clock signal (CLK).

                  The state of the HSYNC signal depends on the value stored in the horizontal
                  counter and is driven low when:

                  HCOUNT ≥ (PixTotal + BlankingLeft)

                  and remains low while:

                  HCOUNT < (PixTotal + BlankingLeft + TLSync)
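                  As a minimal sketch of these two comparisons, the following Python
                  model returns the HSYNC level for a given horizontal count. The function
                  names are hypothetical, and the example values (PixTotal = 640,
                  BlankingLeft = 16, TLSync = 96, line period = 800 clocks) are assumptions
                  chosen to resemble standard 640x480 timing, not values taken from the
                  Controller.

```python
def hsync_level(hcount: int, pix_total: int, blanking_left: int, tl_sync: int) -> int:
    """Return 0 (active low) inside the sync window, 1 (inactive) elsewhere."""
    start = pix_total + blanking_left
    end = start + tl_sync
    return 0 if start <= hcount < end else 1


def low_positions(line_period: int, pix_total: int, blanking_left: int, tl_sync: int):
    """List every horizontal count in one line for which HSYNC is low."""
    return [h for h in range(line_period)
            if hsync_level(h, pix_total, blanking_left, tl_sync) == 0]


if __name__ == "__main__":
    low = low_positions(800, 640, 16, 96)  # assumed example values
    print(low[0], low[-1], len(low))
```

                  The same comparison structure applies to VSYNC, with the vertical
                  counter and the frame-level parameters in place of the horizontal ones.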

                  Vertical (Frame) Period

                  6.3 Generation of the vertical synchronization signal – VSYNC

                  The VSYNC signal is High (inactive) after an external reset signal
                  (RST) is received by the VGA Controller. The signal is updated after every
                  line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

                  The state of the VSYNC signal depends on the value stored in the vertical
                  counter and is driven low when:

                  VCOUNT ≥ (LinTotal + BlankingLeft)

                  and remains low while:

                  VCOUNT < (LinTotal + BlankingLeft + TFSync)

                  Address Counter (MEMADDR)

                  The address counter is used to store the position of the next consecutive
                  pixel in the frame. Its value is passed to the Pixel Unit on the internal
                  bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to
                  obtain the next pixel from picture memory.

                  The counter is reset to zero when the VGA Controller receives an external
                  reset signal (RST). The size of the counter depends on the values chosen for
                  DISPSIZE_H and DISPSIZE_V, as the range is simply:

                  0 to (PixTotal x LinTotal) - 1

                  Taking the maximum number of pixels in a line and lines in a frame for each
                  of the supported display resolutions, the maximum ranges for the counter are:

                  • 640x480: 0 to 307199
                  • 800x600: 0 to 479999

                  The counter has 19-bit resolution.

                  While the value in the horizontal counter (HCOUNT) is less than the
                  total number of viewable pixels in a line (PixTotal, the integer value of
                  DISPSIZE_H), the counter is incremented on the rising edge of the external
                  clock signal (CLK). Pixel addressing within the frame is consecutive. When
                  the counter reaches the last pixel in a line, its incremented value is the
                  first pixel in the next line down.

                  The address counter will continue to be incremented until the value in the
                  vertical counter (VCOUNT) is greater than or equal to the total number of
                  viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At
                  this point it will be rolled over to zero.
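                  The counting rule can be illustrated with a short behavioral model
                  that scans one frame. This is a sketch under stated assumptions: the
                  function name scan_frame is hypothetical, and the line period (800 clocks)
                  and frame period (525 lines) used in the example are assumed values for
                  640x480 timing.

```python
def scan_frame(pix_total: int, lin_total: int, line_period: int, frame_period: int):
    """Step HCOUNT/VCOUNT through one frame and track the address counter.

    Returns (highest address presented for a viewable pixel, final counter value).
    """
    memaddr = 0
    highest = 0
    for vcount in range(frame_period):
        for hcount in range(line_period):
            if vcount >= lin_total:
                memaddr = 0                      # rolled over once past the last viewable line
            elif hcount < pix_total:
                highest = max(highest, memaddr)  # address used for this pixel
                memaddr += 1                     # consecutive addressing across lines
    return highest, memaddr


if __name__ == "__main__":
    print(scan_frame(640, 480, 800, 525))
    # Width needed to hold the largest address of each supported resolution.
    print((640 * 480 - 1).bit_length(), (800 * 600 - 1).bit_length())
```

                  Both maximum addresses need 19 bits, which is consistent with the 19-bit
                  counter resolution stated above.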


                  7 VGA Controller

                  The VGA Controller provides a simple 8-bit interface between a host
                  microcontroller and any VGA-compatible monitor. This variant of the
                  Controller provides six modes of display, depending on the resolution chosen
                  (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
                  White, 16 Colors or 64 Colors).

                  7.1 VGA Controller Tasks

                  1) To display the frames at a minimum 60 Hz refresh rate:

                  1 sec = 60 frames
                  1 frame = max 1/60 sec
                  1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns)
                  Allowing for the blanking intervals, max time for 1 pixel = 40 ns

                  2) To send information for each and every pixel of a frame by means of
                  RGB signals.

                  3) After one row of information has been completely displayed, the column
                  must be reset and the row incremented; this is done by the Hsync signal.

                  4) After one frame of information has been completed, a new frame has to be
                  started; for this, both row and column are reset; this is done by the
                  Vsync signal.

                  => To give the information for 1 pixel we use 1 clock (for controlling all
                  operations):

                  1 clock = 1 pixel
                  1 clock cycle period = 40 ns
                  Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                  So if we use a 25 MHz clock and display the information of 1 pixel on each
                  and every clock, we can attain a minimum 60 Hz refresh rate. Do not use a
                  clock frequency lower than 25 MHz.

                  Ex.:

                  Clock frequency   Time period   No. of frames/sec
                  15 MHz            More          Less (it is unable to display 60 frames/sec)
                  25 MHz            Effectively utilized frequency
                  30 MHz            Less          More

                  As the clock frequency increases, the time period decreases and the number
                  of frames/sec increases.
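                  The pixel-time estimate can be worked through numerically. The sketch
                  below is an illustration only: the function names are hypothetical, and
                  the totals of 800 clocks per line and 525 lines per frame (visible area
                  plus blanking) are assumed values for 640x480 timing.

```python
def pixel_period_ns(refresh_hz: float, clocks_per_line: int, lines_per_frame: int) -> float:
    """Longest clock period (ns) that still reaches the requested refresh rate."""
    return 1e9 / (refresh_hz * clocks_per_line * lines_per_frame)


def refresh_rate_hz(clock_hz: float, clocks_per_line: int, lines_per_frame: int) -> float:
    """Refresh rate obtained when one pixel position is sent per clock."""
    return clock_hz / (clocks_per_line * lines_per_frame)


visible_only_ns = pixel_period_ns(60, 640, 480)   # counts visible pixels only
with_blanking_ns = pixel_period_ns(60, 800, 525)  # assumed totals including blanking

if __name__ == "__main__":
    print(f"visible only : {visible_only_ns:.2f} ns")
    print(f"with blanking: {with_blanking_ns:.2f} ns")
    for mhz in (15, 25, 30):
        print(mhz, "MHz ->", round(refresh_rate_hz(mhz * 1e6, 800, 525), 1), "Hz")
```

                  Under these assumptions a 15 MHz clock falls well short of 60 frames per
                  second, 25 MHz lands close to 60 Hz, and 30 MHz gives a little over 70 Hz.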

                  Signals used for image display:

                  R (Red)
                  G (Green)
                  B (Blue)
                  HSYNC (horizontal synchronization signal)
                  VSYNC (vertical synchronization signal)

                  All of the above signals are used to control the image data so that it is
                  correctly displayed on the VGA display monitor.

                  7.2 Timings for front porch, sync pulse and back porch

                                Horizontal sync   Vertical sync
                  Front porch   0.94 µs           0.35 ms
                  Sync pulse    3.77 µs           0.063 ms
                  Back porch    1.89 µs           1.02 ms

                  8 FUNCTIONAL DESCRIPTION

                  8.1 Symbol

                  VGA Controller Symbol

                  Pin description

                  The pin description is shown in the table below.

                  Table: VGA Pin description

                  Name | Type | Polarity | Bus size | Description

                  Global Control Signals

                  CLK | I | Rising | - | Global system clock. This clock determines the
                  maximum rate at which pixels can be sent to the monitor. The frequency of
                  the clock signal determines the refresh rate as follows:
                      640x480: CLK = 25 MHz => Refresh = 60 Hz; CLK = 30 MHz => Refresh = 72 Hz
                      800x600: CLK = 40 MHz => Refresh = 60 Hz; CLK = 50 MHz => Refresh = 72 Hz

                  RST | I | High | - | Global system reset.

                  VGA Input Settings Signals

                  RESOLUTION | I | High/Low | - | This input selects the screen resolution to
                  be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

                  CMOD | I | - | 2 | This input selects the color palette to be used:
                  00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

                  DISPSIZE_H | I | - | 10 | This input determines the number of viewable
                  pixels to be displayed in each line of a frame, and is therefore used to
                  control the horizontal extents of the visible display area.

                  DISPSIZE_V | I | - | 10 | This input determines the number of lines to be
                  displayed in a frame, and is therefore used to control the vertical extents
                  of the visible display area.

                  Data Memory Control Signals

                  RD | O | High | - | This is the enable signal when data is required to be
                  read from the memory space. This signal is controlled by, and follows, the
                  internal line enable signal en generated by the Synchronization Unit of the
                  Controller.

                  DATA | I | - | 8 | Data input from picture memory space. Data is stored in
                  memory in bytes, the content of which depends on the chosen color palette,
                  as follows:
                      Black & White mode: 8 1-bit pixels
                      16 Colors mode: 2 4-bit pixels
                      64 Colors mode: 1 6-bit pixel (DATA[5..0])
                  Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                  ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture
                  memory. Addresses are consecutive – once the end of the current line has
                  been reached, the next address is that of the pixel at the start of the next
                  line down in the frame.

                  VGA Monitor Control Signals

                  HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is
                  used to control the horizontal deflection circuit in the VGA monitor, so
                  that the start and end of a line of pixels is correctly displayed across the
                  visible display area of the screen. The horizontal size of the display area
                  is controlled by the DISPSIZE_H input to the Controller.

                  VSYNC | O | Falling | - | Vertical synchronization signal. This signal is
                  used to control the vertical deflection circuit in the VGA monitor, so that
                  the start and end of a frame (of lines) is correctly displayed between the
                  top and bottom edges of the visible display area of the screen. The vertical
                  size of the display area is controlled by the DISPSIZE_V input to the
                  Controller.

                  R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the
                  intensity of red used in composing a pixel's displayed color. These two
                  signals are inputs to a simple 2-bit DAC (external to the Controller) that
                  provides the analog signal required by the VGA monitor.

                  G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the
                  intensity of green used in composing a pixel's displayed color. These two
                  signals are inputs to a simple 2-bit DAC (external to the Controller) that
                  provides the analog signal required by the VGA monitor.

                  B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the
                  intensity of blue used in composing a pixel's displayed color. These two
                  signals are inputs to a simple 2-bit DAC (external to the Controller) that
                  provides the analog signal required by the VGA monitor.

                  9 VGA SIGNAL TIMING

                  Figure 3 summarizes the signal timing involved in sending a line of
                  pixels and a frame of lines. The actual time values differ according to the
                  resolution selected (640x480 or 800x600), the processor-defined values for
                  DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
                  (CLK).


                  9.1 VGA PIXEL UNIT

                  The Pixel Unit provides access to the pixelated image stored in external
                  picture memory, reading in data a byte at a time and formatting each pixel
                  to be displayed. For each pixel, the 6-bit RGB value required for the control
                  of the monitor's three primary color electron guns is generated, so that the
                  pixel is displayed on the screen with the correct color.

                  Accessing the stored image

                  The image to be displayed on the monitor screen is written by the host
                  microcontroller into external memory space (RAM). This memory space can
                  be located anywhere (e.g. using a block of RAM within the FPGA design, or
                  using the SRAM on the NanoBoard itself).

                  Pixel data is stored in the memory space in bytes. The number of pixels in a
                  byte depends on which color palette is being used for the image (selected by
                  the CMOD input):

                  • in Black & White mode – one byte of data in RAM contains 8 1-bit pixels
                  • in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels
                  • in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel
                    (DATA[7..6] are not used)

                  The size of memory required to store a picture is determined by the total
                  number of viewable pixels in a line (determined by DISPSIZE_H), the total
                  number of viewable lines in a frame (determined by DISPSIZE_V) and the
                  number of pixels stored in each byte in memory space:

                  Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
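                  Applying this formula to the full extents of both resolutions gives a
                  feel for the memory sizes involved. The following is a minimal sketch; the
                  function name and the mode keys ("bw", "16", "64") are hypothetical labels
                  for the three palettes.

```python
# Pixels packed into each byte of picture memory, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed for one picture, rounded up to a whole byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


if __name__ == "__main__":
    for width, height in [(640, 480), (800, 600)]:
        for mode in ("bw", "16", "64"):
            print(width, height, mode, picture_memory_bytes(width, height, mode))
```

                  Reducing DISPSIZE_H or DISPSIZE_V shrinks the required memory
                  proportionally, since only viewable pixels are stored.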

                  The address in RAM where the next pixel is stored is determined using
                  an internal signal provided by the Synchronization Unit – hvcnt – which
                  reflects the current contents of the MEMADDR register. The exact addressing
                  is described below.

                  9.2 Black & White mode

                  The picture memory address – the byte of data containing the next 8 pixels
                  – is determined by using bits 18..3 of hvcnt and right shifting the contents
                  by three:

                  ADDR_PIXEL = 00 & hvcnt[18..3]

                  16 Colors mode

                  The picture memory address – the byte of data containing the next 2 pixels
                  – is determined by using bits 18..1 of hvcnt and right shifting the contents
                  by one:

                  ADDR_PIXEL = 0 & hvcnt[18..1]

                  64 Colors mode

                  The picture memory address – the byte of data containing the next pixel – is
                  determined by using the full value of hvcnt:

                  ADDR_PIXEL = hvcnt[18..0]
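                  The three addressing rules amount to a right shift of the pixel
                  position by 3, 1 or 0 bits. The sketch below models this on integers; the
                  function name addr_pixel and the mode keys are hypothetical, and the
                  zero-padding of the upper bits is implicit in the integer shift.

```python
# Right shift applied to hvcnt for each palette: 8, 2 or 1 pixels per byte.
ADDR_SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory holding the pixel at position hvcnt."""
    return (hvcnt & 0x7FFFF) >> ADDR_SHIFT[mode]  # hvcnt is treated as 19 bits wide


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, [addr_pixel(n, mode) for n in range(10)])
```

                  In Black & White mode eight consecutive pixel positions share one byte
                  address, in 16 Colors mode two positions share one, and in 64 Colors mode
                  each position has its own address.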

                  The Pixel register (PIXREG)

                  The Pixel register is used to receive the byte of pixel data read from the
                  current address in memory space. The register PIXREG is reset to zero when
                  the VGA Controller receives an external reset signal (RST).

                  The register is updated on each rising edge of the CLK signal. Data can be
                  read from the memory space as long as the RD signal is active (High). The RD
                  signal is itself controlled by the external line display enable signal,
                  enex. This internally generated signal is defined as follows:

                  If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or
                     ((VCOUNT > (LinTotal+1)) and (VCOUNT < Frame Period)) then
                      enex = 0
                  Else
                      enex = 1

                  When enex is Low, read access from memory is disabled (RD = 0).

                  The point at which data is loaded from memory into PIXREG depends on the
                  particular color palette that is chosen – Black & White, 16 Colors or 64
                  Colors.

                  Black & White mode

                  The next byte of data will be loaded into the register whenever the
                  lowest three bits of the hvcnt signal – received from the Synchronization
                  Unit – are "000".

                  For the currently loaded byte, the active pixel is always in the lowest bit
                  position of the Pixel register. Each pixel in the data byte is moved into
                  this active pixel position by shifting the contents of the register right by
                  one bit on each rising edge of CLK.

                  16 Colors mode

                  The next byte of data will be loaded into the register whenever the
                  lowest bit of the hvcnt signal – received from the Synchronization Unit – is
                  a 0.

                  For the currently loaded byte, the active pixel is always in the low-order
                  nibble of the Pixel register. Remember that in this mode each byte of data
                  contains two pixels. The second pixel is moved into this active pixel
                  position by shifting the contents of the register right by four bits on the
                  rising edge of CLK.

                  64 Colors mode

                  The next byte of data will be loaded into the register on the rising edge
                  of the external system clock signal (CLK). In this mode, the read of pixel
                  data does not depend on the status of the hvcnt signal received from the
                  Synchronization Unit.
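                  The shift behavior of the Pixel register in the three modes can be
                  sketched as follows. This is a behavioral illustration in Python rather
                  than the Controller's implementation; the function name unpack_pixels and
                  the mode keys are hypothetical.

```python
# (bits per pixel, pixels per byte) for each palette.
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}


def unpack_pixels(byte: int, mode: str) -> list:
    """Return the pixels of one data byte in the order they become active.

    The active pixel is always taken from the low end of the register, and
    the register is shifted right by one pixel width after each pixel.
    """
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    reg = byte & 0xFF           # PIXREG holds one byte
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)
        reg >>= width           # move the next pixel into the active position
    return pixels


if __name__ == "__main__":
    print(unpack_pixels(0b10110001, "bw"))
    print([hex(p) for p in unpack_pixels(0xA5, "16")])
    print([hex(p) for p in unpack_pixels(0xFF, "64")])
```

                  In 64 Colors mode the mask discards bits 7 and 6, matching the note that
                  DATA[7..6] are not used.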

                  9.3 The RGB register (RGBREG)

                  The RGB register is used to store the six bits that are required for
                  driving the red, green and blue color guns of the target monitor. When the
                  chosen color palette is either Black & White or 16 Colors, these six bits
                  are obtained by mapping the value of the active pixel to a predefined RGB
                  code. When the chosen color palette is 64 Colors, the actual pixel value is
                  used directly.

                  The register RGBREG is reset to zero (000000) when the VGA
                  Controller receives an external reset signal (RST). This RGB code represents
                  black.

                  The register is updated on each rising edge of the CLK signal, and the
                  value loaded is dependent on the state of the line display enable signal,
                  en. When en is Low, blanking is required and RGBREG is loaded with the code
                  for black (000000).

                  Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
                  and a 4-bit pixel (16 Colors mode) into the required RGB color code.

                  The RGB color code stored in the RGB register is output from the VGA
                  Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
                  and B1).

                  The monitor itself expects analog signals as inputs to its electron gun
                  control circuits. This is achieved by using 2-bit digital-to-analog converter
                  circuitry located on the NanoBoard itself, as shown in Figure 4.

                  Figure 9.1 Digital to Analog RGB conversion

                  For each color, the 2-bit digital signal from the VGA Controller can be
                  converted into 4 distinct analog levels. These levels specify the intensity
                  of each of the three primary colors to use when displaying the pixel on the
                  monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
                  (maximum brightness).

                  With each analog input being one of four possible levels, the monitor
                  can display each pixel on the screen with one of 64 different color
                  permutations.
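                  A short numeric sketch makes the level count concrete. It assumes an
                  ideal DAC whose four levels are evenly spaced between 0 V and 0.7 V; the
                  real resistor network may deviate from this, and the function name
                  dac_level is hypothetical.

```python
FULL_SCALE_V = 0.7  # maximum brightness level quoted in the text


def dac_level(code: int) -> float:
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return FULL_SCALE_V * code / 3


# Every combination of one level per primary color.
color_permutations = {(r, g, b) for r in range(4) for g in range(4) for b in range(4)}

if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    print(len(color_permutations))
```

                  Four levels on each of three channels give 4 x 4 x 4 = 64 combinations,
                  which is where the 64-color limit comes from.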

                  Using Block RAM in Spartan-3E Generation FPGAs

                  For applications requiring large on-chip memories, Spartan™-3
                  Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
                  Using various configuration options, SelectRAM blocks create RAM, ROM,
                  FIFOs, large look-up tables, data width converters, circular buffers and
                  shift registers, each supporting various data widths and depths. This
                  application note describes the features and capabilities of block SelectRAM
                  and illustrates how to specify the various options using the Xilinx CORE
                  Generator™ system or via VHDL or Verilog instantiation.

                  Table 2 Block RAM Available in Spartan-3E Devices

                  Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
                  XC3S100E    1             4                       4                  73,728           72
                  XC3S250E    2             6                       12                 221,184          216
                  XC3S500E    2             10                      20                 368,640          360
                  XC3S1200E   2             14                      28                 516,096          504
                  XC3S1600E   2             18                      36                 663,552          648

                  Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
                  which are allocated to data storage and, in some memory configurations, an
                  additional 2K bits allocated to parity or additional "plus" data bits.
                  Physically, the block RAM memory has two completely independent access
                  ports, labeled Port A and Port B. The structure is fully symmetrical, and
                  both ports are interchangeable and support data read and write operations.
                  Each memory port is synchronous, with its own clock, clock enable and write
                  enable. Read operations are also synchronous and require a clock edge and
                  clock enable. Though physically a dual-port memory, block RAM simulates
                  single-port memory in an application, as shown in Figure 1. Furthermore,
                  each block memory supports multiple configurations or aspect ratios. Table 3
                  summarizes the essential SelectRAM features. Cascade multiple block RAMs
                  to create deeper and wider memory organizations, with a minimal timing
                  penalty incurred through specialized routing resources.
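                  To relate these figures to a frame buffer, the sketch below estimates
                  how many block RAMs a picture would occupy and which device in Table 2
                  could hold it. It is a rough estimate only: it counts the 16K data bits
                  per block, ignores parity bits and any other use of block RAM in the
                  design, and the function names are hypothetical.

```python
DATA_BITS_PER_BLOCK = 16 * 1024  # 16K data bits per block RAM

# Total block RAMs per device, as listed in Table 2.
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total: int, lin_total: int, bits_per_pixel: int) -> int:
    """Block RAMs needed to hold one picture, rounded up."""
    bits = pix_total * lin_total * bits_per_pixel
    return -(-bits // DATA_BITS_PER_BLOCK)  # ceiling division


def smallest_device(blocks: int):
    """Smallest listed device with enough block RAMs, or None if none fits."""
    for name, available in sorted(TOTAL_BLOCKS.items(), key=lambda kv: kv[1]):
        if available >= blocks:
            return name
    return None


if __name__ == "__main__":
    for bpp in (1, 4):
        need = blocks_needed(640, 480, bpp)
        print(bpp, "bit/pixel:", need, "blocks ->", smallest_device(need))
```

                  Under these assumptions a full 640x480 Black & White picture fits in
                  on-chip block RAM only on the larger devices, while a 16 Colors picture at
                  the same size exceeds every device in the table and would need external
                  memory or a reduced display size.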


                  10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

                  Why (V)HDL?

                  • Interoperability
                  • Technology independence
                  • Design reuse
                  • Several levels of abstraction
                  • Readability
                  • Standard language
                  • Widely supported

                  What is VHDL?

                  VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                  • Design specification language
                  • Design entry language
                  • Design simulation language
                  • Design documentation language
                  • An alternative to schematics

                  10.1 Brief History

                  • VHDL was developed in the early 1980s for managing design problems that
                    involved large circuits and multiple teams of engineers
                  • Funded by the US Department of Defense
                  • The first publicly available version was released in 1985
                  • In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.)
                    was presented with a proposal to standardize VHDL
                  • In 1987, standardization => IEEE 1076-1987
                  • An improved version of the language was released in 1994 => IEEE
                    standard 1076-1993

                  10.2 Related Standards

                  IEEE 1076 does not itself support simulation conditions such as unknown and
                  high impedance.

                  Soon after IEEE 1076-1987 was released, simulator companies began using
                  their own non-standard types => VHDL was becoming non-standard.

                  The IEEE 1164 standard was then developed by the IEEE. IEEE 1164 contains
                  definitions for a nine-valued data type, std_logic.

                  IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
                  to actual hardware.

                  It defines, e.g., two numeric types: signed and unsigned.
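As a rough illustration of how these standards show up in source code, the sketch below uses std_logic_vector from IEEE 1164 and the unsigned type from IEEE 1076.3 (package numeric_std). The entity name adder8 and its port names are invented for this example and do not appear elsewhere in this report.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use IEEE.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

-- Hypothetical 8-bit adder; all names are illustrative only.
entity adder8 is
  port (a, b : in  std_logic_vector(7 downto 0);
        sum  : out std_logic_vector(8 downto 0));
end adder8;

architecture rtl of adder8 is
begin
  -- Convert to unsigned so that "+" has a defined arithmetic meaning,
  -- extend both operands to 9 bits to keep the carry, then convert back.
  sum <= std_logic_vector(resize(unsigned(a), 9) + resize(unsigned(b), 9));
end rtl;
```

The nine values of std_logic are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-', which cover the unknown and high-impedance conditions mentioned above.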

                  VHDL Environment

                  Design Units

                  Segments of VHDL code that can be compiled separately and stored in a

                  library

                  Entities

                  A black box with interface definition

                  Defines the inputs/outputs of a component (defines the pins)

                  A way to represent modularity in VHDL

                  Similar to symbol in schematic

                  Entity declaration describes entity

                  E.g.:

                  entity Comparator is
                    port (A, B : in  std_logic_vector(7 downto 0);
                          EQ   : out std_logic);
                  end Comparator;

                  Ports: Provide channels of communication between the component and its
                  environment

                  Each port must have a name, a direction and a type

                  An entity may have NO port declaration

                  Port directions:

                  In: The value of a port can be read inside the component but cannot be
                  assigned. Multiple reads of the port are allowed.

                  Out: Assignments can be made to a port, but data from the port cannot be read
                  (before VHDL-2008). Multiple assignments are allowed.

                  Inout: Bi-directional; assignments can be made and data can be read. Multiple
                  assignments are allowed.

                  Buffer: An out port with read capability. May have at most one assignment
                  (buffer ports are not recommended).
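The port directions listed above can be sketched in a small entity. This is only a minimal example under stated assumptions: the entity port_demo and all of its port and signal names are hypothetical, and the internal signal q_int shows one common way to avoid both reading an out port and using a buffer port.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical entity showing in, out and inout ports.
entity port_demo is
  port (clk : in    std_logic;   -- can only be read
        oe  : in    std_logic;   -- output enable for the bidirectional pin
        d   : in    std_logic;
        q   : out   std_logic;   -- can only be assigned
        pad : inout std_logic);  -- can be both read and assigned
end port_demo;

architecture rtl of port_demo is
  signal q_int : std_logic := '0';  -- internal copy that may be read freely
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Read the internal signal rather than the out port q.
      q_int <= d xor q_int xor pad;
    end if;
  end process;

  q   <= q_int;                         -- drive the out port
  pad <= q_int when oe = '1' else 'Z';  -- release the pin when not enabled
end rtl;
```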

                  Architectures: Every entity has at least one architecture

                  One entity can have several architectures

                  Architectures can describe a design using
                  - Behavior
                  - Structure
                  - Dataflow

                  Architectures can describe a design on many levels
                  - Gate level
                  - RTL (Register Transfer Level)
                  - Behavioral level

                  A configuration declaration links an architecture to an entity

                  E.g.:

                  architecture Comparator1 of Comparator is
                  begin
                    EQ <= '1' when (A = B) else '0';
                  end Comparator1;

                  Configurations: Link the entity declaration and the architecture body together

                  The concept of default configuration is a bit messy in VHDL '87
                  - The last architecture analyzed links to the entity

                  Can be used to change simulation behavior without re-analyzing the VHDL
                  source

                  Complex configuration declarations are ignored in synthesis

                  Some entities can have, e.g., a gate-level architecture and a behavioral
                  architecture

                  Configuration declarations are always optional
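To make the role of a configuration declaration more concrete, the following sketch gives the Comparator entity two architectures and then selects one of them. The second architecture (Comparator2) and the configuration name (Comparator_cfg) are assumptions made for this example only.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: dataflow style.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Second architecture (hypothetical): behavioral style with a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration selecting Comparator2 for the entity.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg instead of the bare entity would use Comparator2 regardless of which architecture was analyzed last.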

                  Packages: Packages contain information common to many design units

                  1. Package declaration
                  - constant declarations
                  - type and subtype declarations
                  - function and procedure declarations
                  - global signal declarations
                  - file declarations
                  - component declarations

                  2. Package body
                  - is not necessarily needed
                  - function bodies
                  - procedure bodies

                  Packages are meant for encapsulating data which can be shared globally among
                  several design units

                  A package consists of a declaration part and an optional body part

                  The package declaration can contain
                  - type and subtype declarations
                  - subprograms
                  - constants
                  - alias declarations
                  - global signal declarations
                  - file declarations
                  - component declarations

                  The package body consists of
                  - subprogram declarations and bodies
                  - type and subtype declarations
                  - deferred constants
                  - file declarations
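The split between package declaration and package body might look as follows. This is a sketch only: the package name vga_types, the subtype pixel_t, the function to_pixel and the constant H_TOTAL are all hypothetical, and the value 794 is used merely as a plausible example of a deferred constant.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical package; all names are illustrative.
package vga_types is
  constant H_TOTAL : integer;                        -- deferred constant
  subtype pixel_t is std_logic_vector(0 downto 0);   -- subtype declaration
  function to_pixel (b : std_logic) return pixel_t;  -- function declaration
end vga_types;

package body vga_types is
  constant H_TOTAL : integer := 794;  -- value of the deferred constant

  function to_pixel (b : std_logic) return pixel_t is
    variable p : pixel_t;
  begin
    p(0) := b;
    return p;
  end to_pixel;
end vga_types;
```

A design unit compiled into the same library would make these declarations visible with "use work.vga_types.all;".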

                  Libraries: Collection of VHDL design units (database)

                  1. Packages
                  - package declaration
                  - package body

                  2. Entities (entity declaration)

                  3. Architectures (architecture body)

                  4. Configurations (configuration declarations)

                  A library is usually a directory in the UNIX file system

                  It can also be any other kind of database

                  Levels of Abstraction

                  VHDL supports many possible styles of design description, which differ
                  primarily in how closely they relate to the HW

                  It is possible to describe a circuit in a number of ways, listed here from
                  lower to higher level of abstraction:
                  - Structural
                  - Dataflow
                  - Behavioral

                  Structural VHDL description

                  Circuit is described in terms of its components

                  From a low-level description (e.g. transistor-level description) to a
                  high-level description (e.g. block diagram)

                  For large circuits a low-level description quickly becomes impractical
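A structural description can be sketched with two small design units. The gate and2 and the top-level and3 below are hypothetical names chosen for this example; the point is only that the top level contains no logic of its own, just component instances and the wire between them.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Leaf component (hypothetical): 2-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
  y <= a and b;
end rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- Structural description: a 3-input AND built from two and2 instances.
entity and3 is
  port (a, b, c : in std_logic; y : out std_logic);
end and3;

architecture structural of and3 is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal ab : std_logic;  -- wire between the two gates
begin
  u1 : and2 port map (a => a,  b => b, y => ab);
  u2 : and2 port map (a => ab, b => c, y => y);
end structural;
```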

                  10.3 Dataflow VHDL Description

                  Circuit is described in terms of how data moves through the system

                  In the dataflow style you describe how information flows between registers in
                  the system

                  The combinational logic is described at a relatively high level; the placement
                  and operation of registers is specified quite precisely

                  The behavior of the system over time is defined by registers

                  There are no built-in registers in the VHDL language
                  - Either a lower-level description
                  - or a behavioral description of sequential elements is needed

                  The lower-level register descriptions must be created or obtained

                  If there are no 3rd-party models for registers => you must write the behavioral
                  description of the registers

                  The behavioral description can be provided in the form of subprograms
                  (functions or procedures)
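One possible way to provide such a register description as a subprogram is sketched below. The package reg_pkg, the procedure dff and the entity pipe_and are hypothetical names. The sketch is intended for simulation; whether a given synthesis tool accepts a clocked procedure of this kind is an assumption that would need to be checked.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

package reg_pkg is
  -- Hypothetical procedure modelling a rising-edge D register.
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic_vector;
                 signal q   : out std_logic_vector);
end reg_pkg;

package body reg_pkg is
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic_vector;
                 signal q   : out std_logic_vector) is
  begin
    if rising_edge(clk) then
      q <= d;  -- q keeps its old value on all other events
    end if;
  end dff;
end reg_pkg;

library IEEE;
use IEEE.std_logic_1164.all;
use work.reg_pkg.all;

entity pipe_and is
  port (clk  : in  std_logic;
        a, b : in  std_logic_vector(7 downto 0);
        y    : out std_logic_vector(7 downto 0));
end pipe_and;

architecture dataflow of pipe_and is
  signal ab : std_logic_vector(7 downto 0);
begin
  ab <= a and b;    -- combinational logic in front of the register
  dff(clk, ab, y);  -- concurrent procedure call acting as the register
end dataflow;
```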

                  10.4 Behavioral VHDL Description

                  Circuit is described in terms of its operation over time

                  Representation might include, e.g., state diagrams, timing diagrams and
                  algorithmic descriptions

                  The concept of time may be expressed precisely using delays (e.g. A <= B
                  after 10 ns;)

                  If no actual delay is used, the order of sequential operations is defined

                  In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
                  timing specifications

                  The actual timing results depend on the implementation technology and the
                  efficiency of the synthesis tool

                  There are a few tools for behavioral synthesis
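The delay example mentioned above can be placed in a small simulation-only model. The entity delay_demo and the stimulus values are assumptions made for illustration; a synthesis tool would be expected to ignore the "after" clause.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical simulation-only model; names are illustrative.
entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
  signal A, B : std_logic := '0';
begin
  -- Explicit delay: A follows B 10 ns later.
  A <= B after 10 ns;

  stimulus : process
  begin
    B <= '1';
    wait for 25 ns;
    B <= '0';
    wait for 25 ns;
    -- A was '1' from 10 ns to 35 ns and has been '0' since 35 ns.
    assert A = '0' report "unexpected value on A" severity error;
    wait;  -- suspend forever
  end process;
end behavioral;
```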

                  Concurrent vs. Sequential

                  Processes

                  Basic simulation concept in VHDL

                  A VHDL description can always be broken up into interconnected processes

                  Quite similar to a Unix process

                  Process keyword in VHDL

                  A process statement is a concurrent statement

                  Statements inside a process statement are sequential statements

                  A process must contain either a sensitivity list or wait statement(s), but NOT
                  both

                  The sensitivity list or wait statement(s) contain the signals which wake the
                  process up

                  General format:

                  process [(sensitivity_list)]
                    process_declarative_part
                  begin
                    process_statements
                    [wait_statement]
                  end process;
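The two permitted forms of a process can be compared side by side. In this sketch (entity process_demo and its port names are hypothetical) both processes describe the same AND function, once with a sensitivity list and once with a wait statement.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity process_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end process_demo;

architecture behavioral of process_demo is
begin
  -- Form 1: sensitivity list, no wait statements.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list, wait statement at the end.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process;
end behavioral;
```

Both processes run once at the start of simulation and then again whenever a or b changes, so y1 and y2 always carry the same value.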

                  11 VGA CONTROLLER CODE

                  library IEEE;
                  use IEEE.STD_LOGIC_1164.ALL;
                  use IEEE.NUMERIC_STD.ALL;
                  -- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not needed in this unit and
                  -- are left out because their declarations clash with NUMERIC_STD.

                  ---- Uncomment the following library declaration if instantiating
                  ---- any Xilinx primitives in this code
                  --library UNISIM;
                  --use UNISIM.VComponents.all;

                  entity vga_controller is
                    port (
                      clk50m : in  std_logic;
                      rstn   : in  std_logic;
                      hsync  : out std_logic;
                      vsync  : out std_logic;
                      red    : out std_logic_vector(0 downto 0);
                      green  : out std_logic_vector(0 downto 0);
                      blue   : out std_logic_vector(0 downto 0));
                  end vga_controller;

                  architecture Behavioral of vga_controller is

                    -- Frame memory (1 bit per pixel, 18-bit address)
                    component memory
                      port (
                        addr  : in  std_logic_vector(17 downto 0);
                        clk   : in  std_logic;
                        dout  : out std_logic_vector(0 downto 0);
                        en    : in  std_logic;
                        sinit : in  std_logic);
                    end component;

                    component sync_unit
                      port (
                        clk        : in  std_logic;
                        rstn       : in  std_logic;
                        hsync      : out std_logic;  -- horizontal sync pulse (row)
                        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
                        v_en_reg   : out std_logic;
                        h_en       : out std_logic;
                        v_en       : out std_logic;
                        line_count : out integer;    -- mode matches the sync_unit entity
                        c_hs       : out integer);
                    end component;

                    component pixelgen
                      port (
                        clk        : in  std_logic;
                        rstn       : in  std_logic;
                        hsync      : in  std_logic;
                        vsync      : in  std_logic;
                        h_en       : in  std_logic;
                        v_en       : in  std_logic;
                        v_en_reg   : in  std_logic;
                        data_in    : in  std_logic_vector(0 downto 0);
                        line_count : in  integer;
                        c_hs       : in  integer;
                        red        : out std_logic_vector(0 downto 0);
                        green      : out std_logic_vector(0 downto 0);
                        blue       : out std_logic_vector(0 downto 0);
                        addr       : out std_logic_vector(17 downto 0);
                        rd         : out std_logic);
                    end component;

                    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
                    signal line_count_s, c_hs_s, addr : integer;
                    signal data_s : std_logic_vector(0 downto 0);
                    signal addr_s : std_logic_vector(17 downto 0);

                  begin

                    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

                    -- Divide the 50 MHz board clock by two to obtain the pixel clock
                    process (clk50m, rstn)
                    begin
                      if (rstn = '0') then
                        clk <= '0';
                      elsif (clk50m = '1' and clk50m'event) then
                        clk <= not clk;
                      end if;
                    end process;

                    hsync <= hsync_s;
                    vsync <= vsync_s;

                    sync : sync_unit port map (
                      clk        => clk,
                      rstn       => rstn,
                      hsync      => hsync_s,
                      vsync      => vsync_s,
                      v_en_reg   => v_en_reg_s,
                      line_count => line_count_s,
                      h_en       => h_en_s,
                      v_en       => v_en_s,
                      c_hs       => c_hs_s
                    );

                    pixel : pixelgen port map (
                      clk        => clk,
                      rstn       => rstn,
                      hsync      => hsync_s,
                      vsync      => vsync_s,
                      h_en       => h_en_s,
                      v_en       => v_en_s,
                      v_en_reg   => v_en_reg_s,
                      data_in    => data_s,
                      line_count => line_count_s,
                      c_hs       => c_hs_s,
                      red        => red,
                      green      => green,
                      blue       => blue,
                      addr       => addr_s,
                      rd         => rd_s
                    );

                    -- Note: sinit is tied to the active-low reset, which assumes the
                    -- memory core was generated with an active-low SINIT pin.
                    mem12 : memory port map (
                      addr  => addr_s,
                      clk   => clk,
                      dout  => data_s,
                      en    => rd_s,
                      sinit => rstn
                    );

                  end Behavioral;

                  SYNCHRONIZATION BLOCK CODE

                  library IEEE;
                  use IEEE.STD_LOGIC_1164.ALL;
                  use IEEE.STD_LOGIC_ARITH.ALL;
                  use IEEE.STD_LOGIC_UNSIGNED.ALL;

                  entity sync_unit is
                    port (
                      clk        : in  std_logic;
                      rstn       : in  std_logic;
                      hsync      : out std_logic;  -- horizontal sync pulse (row)
                      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
                      v_en_reg   : out std_logic;
                      h_en       : out std_logic;
                      v_en       : out std_logic;
                      line_count : out integer;
                      c_hs       : out integer);
                  end sync_unit;

                  architecture arch_sync_unit of sync_unit is
                    signal h_en_s, v_en_s : std_logic;
                    signal line_count_s, c_hs_s : integer;
                  begin

                    c_hs       <= c_hs_s;
                    line_count <= line_count_s;
                    h_en       <= h_en_s;
                    v_en       <= v_en_s;

                    -- hsync counter: counts pixel clocks 0 .. 793 within a line
                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        c_hs_s <= 0;
                      elsif (clk = '1' and clk'event) then
                        if (c_hs_s = 793) then
                          c_hs_s <= 0;
                        else
                          c_hs_s <= c_hs_s + 1;
                        end if;
                      end if;
                    end process;

                    -- vsync counter
                    -- vertical line counter: advances on each rising edge of h_en_s
                    process (h_en_s, rstn)
                    begin
                      if (rstn = '0') then
                        line_count_s <= 0;
                      elsif (h_en_s = '1' and h_en_s'event) then
                        if (line_count_s = 515) then
                          line_count_s <= 0;
                        else
                          line_count_s <= line_count_s + 1;
                        end if;
                      end if;
                    end process;

                    -- hsync pulse
                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        hsync <= '1';
                      elsif (clk = '1' and clk'event) then
                        if (c_hs_s <= 95) then
                          hsync <= '0';
                        else
                          hsync <= '1';
                        end if;
                      end if;
                    end process;

                    -- Horizontal enable (row information will be sent in this period)
                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        h_en_s <= '0';
                      elsif (clk = '1' and clk'event) then
                        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                          h_en_s <= '1';
                        else
                          h_en_s <= '0';
                        end if;
                      end if;
                    end process;

                    -- Vertical enable (data is valid in this region)
                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        v_en_s <= '0';
                      elsif (clk = '1' and clk'event) then
                        if ((line_count_s >= 35) and (line_count_s <= 515)) then
                          v_en_s <= '1';
                        else
                          v_en_s <= '0';
                        end if;
                      end if;
                    end process;

                    -- vsync pulse (vertical sync, front porch, back porch)
                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        vsync <= '1';
                      elsif (clk = '1' and clk'event) then
                        if (line_count_s < 2) then
                          vsync <= '0';
                        else
                          vsync <= '1';
                        end if;
                      end if;
                    end process;

                    -- v_en delayed by one clock, used to detect the start of a frame
                    process (clk)
                    begin
                      if (clk = '1' and clk'event) then
                        v_en_reg <= v_en_s;
                      end if;
                    end process;

                  end arch_sync_unit;
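The literal values in the synchronization block are easier to follow when they are given names. The package below is only a sketch: the package name sync_timing and every constant name are hypothetical, and the values are simply copied from the comparisons in sync_unit rather than taken from a timing standard.

```vhdl
-- Hypothetical package naming the literal values used in sync_unit.
-- Horizontal values are in pixel clocks, vertical values in lines.
package sync_timing is
  constant H_LAST       : integer := 793;  -- c_hs_s counts 0 .. 793 (794 clocks per line)
  constant H_SYNC_END   : integer := 95;   -- hsync is driven low while c_hs_s <= 95
  constant H_EN_START   : integer := 143;  -- h_en_s is set while 143 <= c_hs_s <= 778
  constant H_EN_END     : integer := 778;
  constant V_LAST       : integer := 515;  -- line_count_s counts 0 .. 515
  constant V_SYNC_LINES : integer := 2;    -- vsync is driven low while line_count_s < 2
  constant V_EN_START   : integer := 35;   -- v_en_s is set while 35 <= line_count_s <= 515
end sync_timing;
```

With "use work.sync_timing.all;" a comparison such as (c_hs_s = 793) could then be written as (c_hs_s = H_LAST).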

                  PIXEL GENERATOR

                  library IEEE;
                  use IEEE.STD_LOGIC_1164.ALL;
                  use IEEE.STD_LOGIC_ARITH.ALL;
                  use IEEE.STD_LOGIC_UNSIGNED.ALL;  -- provides "+" on std_logic_vector

                  entity pixelgen is
                    port (
                      clk        : in  std_logic;
                      rstn       : in  std_logic;
                      hsync      : in  std_logic;
                      vsync      : in  std_logic;
                      h_en       : in  std_logic;
                      v_en       : in  std_logic;
                      v_en_reg   : in  std_logic;
                      data_in    : in  std_logic_vector(0 downto 0);
                      line_count : in  integer;
                      c_hs       : in  integer;
                      red        : out std_logic_vector(0 downto 0);
                      green      : out std_logic_vector(0 downto 0);
                      blue       : out std_logic_vector(0 downto 0);
                      addr       : out std_logic_vector(17 downto 0);
                      rd         : out std_logic
                    );
                  end pixelgen;

                  architecture behv of pixelgen is
                    signal addr_s : std_logic_vector(17 downto 0);
                  begin

                    addr <= addr_s;

                    process (clk, rstn)
                    begin
                      if (rstn = '0') then
                        rd     <= '0';
                        red    <= "0";
                        green  <= "0";
                        blue   <= "0";
                        addr_s <= (others => '0');
                      elsif (clk = '1' and clk'event) then
                        -- Rising edge of v_en: restart the address at the top of a frame
                        if ((v_en = '1') and (v_en_reg = '0')) then
                          addr_s <= (others => '0');
                        end if;
                        if (v_en = '1' and h_en = '1') then
                          if (line_count <= 514) then
                            if (c_hs <= 655) then
                              -- Inside the picture: show the memory bit on all channels
                              rd     <= '1';
                              red    <= data_in;
                              green  <= data_in;
                              blue   <= data_in;
                              addr_s <= addr_s + "000000000000000001";
                            else
                              -- Right of the picture: green
                              rd    <= '0';
                              red   <= "0";
                              green <= "1";
                              blue  <= "0";
                            end if;
                          else
                            -- Below the picture: white
                            rd    <= '0';
                            red   <= "1";
                            green <= "1";
                            blue  <= "1";
                          end if;
                          -- Wrap the address after the last pixel of the frame
                          if (addr_s = "111011111111111111") then
                            addr_s <= (others => '0');
                          end if;
                        else
                          -- Blanking interval: black
                          red   <= "0";
                          green <= "0";
                          blue  <= "0";
                        end if;
                      end if;
                    end process;

                  end behv;

                  12 FPGA ARCHITECTURE

                  A field-programmable gate array (FPGA) is a semiconductor device that can
                  be configured by the customer or designer after manufacturing, hence the
                  name "field-programmable". FPGAs are programmed using a logic circuit
                  diagram or source code in a hardware description language (HDL) to specify
                  how the chip will work. They can be used to implement any logical function
                  that an application-specific integrated circuit (ASIC) could perform, but the
                  ability to update the functionality after shipping offers advantages for many
                  applications.

                  FPGAs contain programmable logic components called "logic blocks" and a
                  hierarchy of reconfigurable interconnects that allow the blocks to be "wired
                  together", somewhat like a one-chip programmable breadboard. Logic blocks
                  can be configured to perform complex combinational functions or merely
                  simple logic gates like AND and XOR. In most FPGAs the logic blocks also
                  include memory elements, which may be simple flip-flops or more complete
                  blocks of memory.
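A logic block of the kind described above can be approximated by a small lookup table followed by an optional flip-flop. The model below is a simplified sketch, not the structure of any particular device: the entity logic_block, the generics INIT and USE_REG, and the port names are all assumptions made for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical model of a simple logic block: a 3-input LUT followed by
-- an optional flip-flop. Generic and port names are illustrative.
entity logic_block is
  generic (INIT    : std_logic_vector(7 downto 0) := "10000000";  -- 3-input AND
           USE_REG : boolean := false);
  port (clk : in  std_logic;
        i   : in  std_logic_vector(2 downto 0);
        o   : out std_logic);
end logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff      : std_logic := '0';
begin
  -- The inputs select one bit of the truth table stored in INIT.
  lut_out <= INIT(to_integer(unsigned(i)));

  process (clk)
  begin
    if rising_edge(clk) then
      ff <= lut_out;  -- memory element
    end if;
  end process;

  -- Either the registered or the purely combinational result is used.
  o <= ff when USE_REG else lut_out;
end rtl;
```

Changing INIT reconfigures the function without changing the structure; for instance "10010110" would give a 3-input XOR, since bits 1, 2, 4 and 7 of the truth table are set.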

                  History

                  The FPGA industry sprouted from programmable read-only memory (PROM) and
                  programmable logic devices (PLDs). PROMs and PLDs both had the option of
                  being programmed in batches in a factory or in the field (field
                  programmable). However, programmable logic was hard-wired between logic
                  gates.

                  Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
                  first commercially viable field-programmable gate array in 1985: the
                  XC2064. The XC2064 had programmable gates and programmable interconnects
                  between gates, the beginnings of a new technology and market. The XC2064
                  boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
                  tables (LUTs). More than 20 years later, Freeman was entered into the
                  National Inventors Hall of Fame for his invention.

                  Some of the industry's foundational concepts and technologies for
                  programmable logic arrays, gates and logic blocks are founded in patents
                  awarded to David W. Page and LuVerne R. Peterson in 1985.

                  In the late 1980s the Naval Surface Warfare Department funded an experiment
                  proposed by Steve Casselman to develop a computer that would implement
                  600,000 reprogrammable gates. Casselman was successful, and the system was
                  awarded a patent in 1992.

                  Xilinx continued unchallenged and quickly growing from 1985 to the
                  mid-1990s, when competitors sprouted up, eroding significant market share.
                  By 1993, Actel was serving about 18 percent of the market.

                  The 1990s were an explosive period of time for FPGAs, both in sophistication
                  and in the volume of production. In the early 1990s, FPGAs were primarily
                  used in telecommunications and networking. By the end of the decade, FPGAs
                  found their way into consumer, automotive and industrial applications.

                  FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
                  algorithm technology and FPGAs to create a sound recognition device.
                  Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
                  to decide the configuration needed to accomplish a sound recognition task.

                  12.1 Modern developments

                  A recent trend has been to take the coarse-grained architectural approach a
                  step further by combining the logic blocks and interconnects of traditional
                  FPGAs with embedded microprocessors and related peripherals to form a
                  complete "system on a programmable chip". This work mirrors the architecture
                  by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
                  combined a reconfigurable CPU architecture on a single chip called the SB24.
                  That work was done in 1982. Examples of such hybrid technologies can be found
                  in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
                  PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
                  is another such device, which uses an AVR processor in combination with
                  Atmel's programmable logic architecture.

                  An alternate approach to using hard-macro processors is to make use of "soft"
                  processor cores that are implemented within the FPGA logic. (See "Soft
                  processors" below.)

                  As previously mentioned, many modern FPGAs have the ability to be
                  reprogrammed at "run time", and this is leading to the idea of reconfigurable
                  computing or reconfigurable systems: CPUs that reconfigure themselves to
                  suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
                  example of a reconfigurable soft processor implemented on FPGAs. However,
                  it does not support dynamic reconfiguration at runtime, but instead adapts
                  itself to a specific program.

                  Additionally, new non-FPGA architectures are beginning to emerge.
                  Software-configurable microprocessors such as the Stretch S5000 adopt a
                  hybrid approach by providing an array of processor cores and FPGA-like
                  programmable cores on the same chip.

                  Gates

                  1987: 9,000 gates, Xilinx

                  1992: 600,000, Naval Surface Warfare Department

                  Early 2000s: millions

                  Market size

                  1985: First commercial FPGA technology invented by Xilinx

                  1987: $14 million

                  ~1993: >$385 million

                  2005: $1.9 billion

                  2010 estimates: $2.75 billion

                  12.2 FPGA Comparisons

                  Historically, FPGAs have been slower, less energy efficient and generally
                  achieved less functionality than their fixed ASIC counterparts. A combination
                  of volume, fabrication improvements, research and development, and the I/O
                  capabilities of new supercomputers have largely closed the performance gap
                  between ASICs and FPGAs.

                  Advantages include a shorter time to market, the ability to re-program in the
                  field to fix bugs, and lower non-recurring engineering costs. Vendors can also
                  take a middle road by developing their hardware on ordinary FPGAs, but
                  manufacturing their final version so that it can no longer be modified after
                  the design has been committed.

                  Xilinx claims that several market and technology dynamics are changing the
                  ASIC/FPGA paradigm:

                  IC costs are rising aggressively

                  ASIC complexity has bolstered development time and costs

                  R&D resources and headcount are decreasing

                  Revenue losses for slow time-to-market are increasing

                  Financial constraints in a poor economy are driving low-cost technologies

                  These trends make FPGAs a better alternative than ASICs for a growing
                  number of higher-volume applications than they have historically been used
                  for, to which the company attributes the growing number of FPGA design starts
                  (see History).

                  The primary differences between CPLDs and FPGAs are architectural. A
                  CPLD has a somewhat restrictive structure consisting of one or more
                  programmable sum-of-products logic arrays feeding a relatively small number
                  of clocked registers. The result of this is less flexibility, with the
                  advantage of more predictable timing delays and a higher
                  logic-to-interconnect ratio. FPGA architectures, on the other hand, are
                  dominated by interconnect. This makes them far more flexible (in terms of the
                  range of designs that are practical for implementation within them) but also
                  far more complex to design for.

                  Another notable difference between CPLDs and FPGAs is the presence in most
                  FPGAs of higher-level embedded functions (such as adders and multipliers)
                  and embedded memories, as well as logic blocks that implement decoders
                  or mathematical functions.

                  Some FPGAs have the capability of partial re-configuration, which lets one
                  portion of the device be re-programmed while other portions continue running.

                  12.3 Applications

                  Applications of FPGAs include digital signal processing, software-defined
                  radio, aerospace and defense systems, ASIC prototyping, medical imaging,
                  computer vision, speech recognition, cryptography, bioinformatics, computer
                  hardware emulation, radio astronomy and a growing range of other areas.

                  FPGAs originally began as competitors to CPLDs and competed in a similar
                  space, that of glue logic for PCBs. As their size, capabilities and speed
                  increased, they began to take over larger and larger functions, to the point
                  where some are now marketed as full systems on chips (SoC). Particularly with
                  the introduction of dedicated multipliers into FPGA architectures in the late
                  1990s, applications which had traditionally been the sole reserve of DSPs
                  began to incorporate FPGAs instead.

                  FPGAs especially find applications in any area or algorithm that can make use
                  of the massive parallelism offered by their architecture. One such area is
                  code breaking, in particular brute-force attacks on cryptographic algorithms.

                  FPGAs are increasingly used in conventional high-performance computing
                  applications, where computational kernels such as FFT or convolution are
                  performed on the FPGA instead of a microprocessor.

                  The inherent parallelism of the logic resources on an FPGA allows for
                  considerable computational throughput even at low MHz clock rates. The
                  flexibility of the FPGA allows for even higher performance by trading off
                  precision and range in the number format for an increased number of parallel
                  arithmetic units. This has driven a new type of processing called
                  reconfigurable computing, where time-intensive tasks are offloaded from
                  software to FPGAs.
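The trade-off between precision and the number of parallel arithmetic units can be sketched with a generate statement. The entity parallel_add and its generics N and WIDTH are hypothetical; the sketch only illustrates that each loop iteration describes a separate adder working in the same clock cycle as all the others.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical bank of N adders that all operate in the same clock cycle.
-- WIDTH sets the precision of each unit; names are illustrative.
entity parallel_add is
  generic (N     : positive := 4;
           WIDTH : positive := 8);
  port (clk : in  std_logic;
        a   : in  std_logic_vector(N*WIDTH - 1 downto 0);
        b   : in  std_logic_vector(N*WIDTH - 1 downto 0);
        s   : out std_logic_vector(N*WIDTH - 1 downto 0));
end parallel_add;

architecture rtl of parallel_add is
begin
  units : for k in 0 to N - 1 generate
    -- Each iteration describes a separate piece of hardware.
    process (clk)
    begin
      if rising_edge(clk) then
        s((k+1)*WIDTH - 1 downto k*WIDTH) <=
          std_logic_vector(unsigned(a((k+1)*WIDTH - 1 downto k*WIDTH)) +
                           unsigned(b((k+1)*WIDTH - 1 downto k*WIDTH)));
      end if;
    end process;
  end generate units;
end rtl;
```

With the same total port width, halving WIDTH and doubling N yields twice as many additions per clock at half the precision, and the carry out of each slice is discarded.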

                  The adoption of FPGAs in high performance computing is currently limited by
                  the complexity of FPGA design compared to conventional software, and by the
                  extremely long turn-around times of current design tools, where a wait of
                  4-8 hours is necessary after even minor changes to the source code.

                  Traditionally, FPGAs have been reserved for specific vertical applications
                  where the volume of production is small. For these low-volume applications,
                  the premium that companies pay in hardware costs per unit for a
                  programmable chip is more affordable than the development resources spent
                  on creating an ASIC. Today, new cost and performance dynamics have broadened
                  the range of viable applications.

                  Architecture

                  The most common FPGA architecture consists of an array of configurable
                  logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
                  routing channels have the same width (number of wires). Multiple I/O pads may
                  fit into the height of one row or the width of one column in the array.

                  An application circuit must be mapped into an FPGA with adequate resources.
                  While the number of CLBs and I/Os required is easily determined from the
                  design, the number of routing tracks needed may vary considerably, even
                  among designs with the same amount of logic. (For example, a crossbar switch
                  requires much more routing than a systolic array with the same gate count.)
                  Since unused routing tracks increase the cost (and decrease the performance)
                  of the part without providing any benefit, FPGA manufacturers try to provide
                  just enough tracks so that most designs that will fit in terms of LUTs and
                  I/Os can be routed. This is determined by estimates such as those derived
                  from Rent's rule, or by experiments with existing designs.

                  The FPGA considered here is an array, or island-style, FPGA. It consists of
                  an array of logic blocks and routing channels. Two I/O pads fit into the
                  height of one row or the width of one column, as shown below. All the routing
                  channels have the same width (number of wires).

                  12.1 FPGA structure

                  A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                  flip-flop, as shown below. In recent years, manufacturers have started moving
                  to 6-input LUTs in their high performance parts, claiming increased
                  performance.

                  Typical logic block


                  There is only one output, which can be either the registered or the
                  unregistered LUT output. The logic block has four inputs for the LUT and a
                  clock input. Since clock signals (and often other high-fanout signals) are
                  normally routed via special-purpose dedicated routing networks in commercial
                  FPGAs, they and other signals are managed separately.
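The logic block described above (one 4-input LUT, one flip-flop, one output) can be modelled behaviourally in a few lines. This is only a sketch for illustration, not vendor code; the class name `LogicBlock` and its methods are hypothetical.

```python
class LogicBlock:
    """Behavioural model of a classic logic block: a 4-input LUT plus a flip-flop."""

    def __init__(self, truth_table, registered=False):
        # The LUT configuration is 16 bits: one output bit per input combination.
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.flip_flop = 0  # state of the output register

    def lut(self, a, b, c, d):
        # The four inputs form an index into the truth table.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        # On a clock edge the flip-flop captures the current LUT output.
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # The single output is either the registered or the unregistered LUT value.
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND gate: only index 15 (all inputs high) yields 1.
and4 = LogicBlock([0] * 15 + [1])
```

Changing only the 16-entry table turns the same block into any other function of four inputs, which is the point of a lookup-table architecture.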

                  For this example architecture, the locations of the FPGA logic block pins are
                  shown below.

                  Logic Block Pin Locations

                  Each input is accessible from one side of the logic block, while the output
                  pin can connect to routing wires in both the channel to the right and the
                  channel below the logic block.

                  Each logic block output pin can connect to any of the wiring segments in the
                  channels adjacent to it.

                  Similarly, an I/O pad can connect to any one of the wiring segments in the
                  channel adjacent to it. For example, an I/O pad at the top of the chip can
                  connect to any of the W wires (where W is the channel width) in the
                  horizontal channel immediately below it.

                  Generally, the FPGA routing is unsegmented. That is, each wiring segment
                  spans only one logic block before it terminates in a switch box. By turning
                  on some of the programmable switches within a switch box, longer paths can be
                  constructed. For higher speed interconnect, some FPGA architectures use
                  longer routing lines that span multiple logic blocks.

                  Whenever a vertical and a horizontal channel intersect, there is a switch
                  box. In this architecture, when a wire enters a switch box, there are three
                  programmable switches that allow it to connect to three other wires in
                  adjacent channel segments. The pattern, or topology, of switches used in this
                  architecture is the planar or domain-based switch box topology. In this
                  switch box topology, a wire in track number 1 connects only to wires in track
                  number 1 in adjacent channel segments, wires in track number 2 connect only
                  to other wires in track number 2, and so on. The figure below illustrates the
                  connections in a switch box.

                  Switch box topology

                  Modern FPGA families expand upon the above capabilities to include higher
                  level functionality fixed into the silicon. Having these common functions
                  embedded into the silicon reduces the area required and gives those functions
                  increased speed compared to building them from primitives. Examples of these
                  include multipliers, generic DSP blocks, embedded processors, high speed I/O
                  logic, and embedded memories.

                  FPGAs are also widely used for systems validation, including pre-silicon
                  validation, post-silicon validation, and firmware development. This allows
                  chip companies to validate their design before the chip is produced in the
                  factory, reducing the time to market.

                  12.4 FPGA Design and Programming

                  To define the behavior of the FPGA, the user provides a hardware description
                  language (HDL) or a schematic design. The HDL form might be easier to work
                  with when handling large structures, because it is possible to just specify
                  them numerically rather than having to draw every piece by hand. On the other
                  hand, schematic entry can allow for easier visualisation of a design.

                  Then, using an electronic design automation tool, a technology-mapped netlist
                  is generated. The netlist can then be fitted to the actual FPGA architecture
                  using a process called place-and-route, usually performed by the FPGA
                  company's proprietary place-and-route software. The user will validate the
                  map, place, and route results via timing analysis, simulation, and other
                  verification methodologies. Once the design and validation process is
                  complete, the binary file generated (also using the FPGA company's
                  proprietary software) is used to (re)configure the FPGA.

                  Going from schematic/HDL source files to actual configuration: the source
                  files are fed to a software suite from the FPGA/CPLD vendor that, through
                  different steps, will produce a file. This file is then transferred to the
                  FPGA/CPLD via a serial interface (JTAG), or to an external memory device
                  such as an EEPROM.

                  The most common HDLs are VHDL and Verilog, although, in an attempt to
                  reduce the complexity of designing in HDLs (which have been compared to the
                  equivalent of assembly languages), there are moves to raise the abstraction
                  level through the introduction of alternative languages.

                  To simplify the design of complex systems in FPGAs, there exist libraries of
                  predefined complex functions and circuits that have been tested and optimized
                  to speed up the design process. These predefined circuits are commonly called
                  IP cores, and are available from FPGA vendors and third-party IP suppliers
                  (rarely free, and typically released under proprietary licenses). Other
                  predefined circuits are available from developer communities such as
                  OpenCores (typically free, and released under the GPL, BSD, or a similar
                  license) and other sources.

                  In a typical design flow, an FPGA application developer will simulate the
                  design at multiple stages throughout the design process. Initially, the RTL
                  description in VHDL or Verilog is simulated by creating test benches to
                  simulate the system and observe results. Then, after the synthesis engine has
                  mapped the design to a netlist, the netlist is translated to a gate level
                  description, where simulation is repeated to confirm that the synthesis
                  proceeded without errors. Finally, the design is laid out in the FPGA, at
                  which point propagation delays can be added and the simulation run again with
                  these values back-annotated onto the netlist.

                  Basic Process Technology Types

                  • SRAM - based on static memory technology. In-system programmable and
                    re-programmable. Requires external boot devices. CMOS.

                  • Antifuse - One-time programmable. CMOS.

                  • EPROM - Erasable Programmable Read-Only Memory technology. Usually
                    one-time programmable in production because of plastic packaging.
                    Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                  • EEPROM - Electrically Erasable Programmable Read-Only Memory
                    technology. Can be erased, even in plastic packages. Some, but not all,
                    EEPROM devices can be in-system programmed. CMOS.

                  • Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                    packages. Some, but not all, flash devices can be in-system programmed.
                    Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                    therefore less expensive to manufacture. CMOS.

                  • Fuse - One-time programmable. Bipolar.

                  Major Manufacturers

                  Xilinx and Altera are the current FPGA market leaders and long-time industry
                  rivals. Together, they control over 80 percent of the market, with Xilinx
                  alone representing over 50 percent.

                  Xilinx also provides free Windows and Linux design software, while Altera
                  provides free Windows tools; the Solaris and Linux tools are only available
                  via a rental scheme.

                  Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                  (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                  Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused
                  CSSP, no general purpose FPGAs).

                  12.5 FPGA prototype

                  FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                  prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
                  hardware verification and early software development.

                  Verification of hardware designs together with early software and firmware
                  co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA
                  has become a good method to do this.

                  Reasons why prototyping is important

                  1 Running a SoC design on an FPGA prototype is a reliable way to ensure that
                    it is functionally correct. This is compared to designers relying only on
                    software simulations to verify that their hardware design is sound.
                    Simulation speed and modeling accuracy limitations hinder this development.

                  2 Due to time constraints, many projects cannot wait until the silicon is
                    back from the foundry to start on software tests. FPGA prototyping allows
                    much more time for software development and testing at the
                    software-hardware integration stage. This exposes many unforeseen software
                    bugs that appear due to today's array of operating systems, applications,
                    and hardware.

                  3 Prototyping also allows the developer to ensure that all IP technologies on
                    the system work well together, away from the simulation stage and in actual
                    form.

                  4 Prototyping has the added advantage of serving as a demo platform for SoC
                    clients, bringing in interest early. This speeds up the overall development
                    cycle and allows for more enhancement or improvement of the chip features
                    than would otherwise have been possible.

                  13 SIMULATION RESULTS

                  Fig: Gate level - VGA Controller

                  Fig: Gate level - Sync Unit

                  Fig: Technology schematic

                  14 APPLICATIONS

                  Motion Tracking Systems

                  Image processing systems

                  Displaying Systems

                  14.1 CONCLUSION

                  Hence, when implemented on an FPGA, the Controller takes a processor-generated
                  (pixelated) picture from memory space and provides digital RGB values for
                  each pixel, as well as horizontal and vertical synchronization signals, in
                  order to correctly display the picture on a connected monitor.



                    The color generated for a pixel in the Pixel Unit depends on whether
                    the particular pixel needs to be blanked or not. The Synchronization Unit
                    provides a signal to the Pixel Unit for this very reason. This is the line
                    display enable signal, en. The signal is checked on each rising edge of the
                    external clock signal (CLK) and is set as follows:

                    If (HCOUNT >= PixTotal) or (VCOUNT >= LinTotal) then
                        en = 0 (pixel needs to be blanked - set color to black)
                    Else
                        en = 1 (pixel is a viewable pixel - generate RGB color accordingly)
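The rule for the en signal above translates directly into a small function. This is a behavioural sketch in Python rather than HDL; the function name `display_enable` is hypothetical, while the arguments mirror HCOUNT, VCOUNT, PixTotal, and LinTotal from the text.

```python
def display_enable(hcount, vcount, pix_total, lin_total):
    """Return 1 for a viewable pixel and 0 when the pixel must be blanked."""
    if hcount >= pix_total or vcount >= lin_total:
        return 0  # outside the visible area: blank the pixel (black)
    return 1      # inside the visible area: generate the RGB color
```

For a 640x480 display, `display_enable(639, 479, 640, 480)` is the last viewable pixel, and any count at or beyond 640 horizontally or 480 vertically is blanked.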

                    6 SIGNALS IN VGA CONTROLLER

                    For the most common VGA mode, 640x480, 60 Hz, non-interlaced, the horizontal
                    timings are:

                    Parameter                     Value      Unit
                    Clock frequency               25.175     MHz
                    Horizontal pixels             640
                    Horizontal sync polarity      Negative
                    Total time for each line      31.77      µs
                    Front porch (A)               0.94       µs
                    Sync pulse length (B)         3.77       µs
                    Back porch (C)                1.89       µs
                    Active video (D)              25.17      µs

                    (Total horizontal sync time: 6.60 µs)

                    6.1 The vertical timings are:

                    Parameter                     Value      Unit
                    Vertical lines                480
                    Vertical sync polarity        Negative
                    Vertical frequency            59.94      Hz
                    Front porch (E)               0.35       ms
                    Sync pulse length (F)         0.06       ms
                    Back porch (G)                1.02       ms
                    Active video (H)              15.25      ms

                    (Total vertical sync time: 1.43 ms)

                    640 x 400, 70 Hz is the video mode used for booting most x86 personal
                    computers.

                    640 x 480, 60 Hz is the default MS-Windows graphics mode with 16 colors.

                    It should be noted that the actual timings vary slightly. For example, for
                    640x480 at 60 fps, a 25.17 µs active video time with a pixel frequency of
                    25.174 MHz gives 633 pixels rather than the expected 640 pixels.
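The arithmetic behind these timing figures can be checked with a short calculation. The sketch below uses only the horizontal values quoted in the table (in µs) and the 25.174 MHz figure from the note; the dictionary and function names are hypothetical.

```python
# Horizontal timings from the table above, in microseconds.
H_TIMINGS_US = {
    "front_porch": 0.94,
    "sync_pulse": 3.77,
    "back_porch": 1.89,
    "active_video": 25.17,
}


def clock_cycles(duration_us, clock_mhz):
    # A duration in microseconds times a frequency in MHz gives clock cycles.
    return duration_us * clock_mhz


# Blanking is everything except the active video part of the line.
blanking_us = (H_TIMINGS_US["front_porch"]
               + H_TIMINGS_US["sync_pulse"]
               + H_TIMINGS_US["back_porch"])
line_us = blanking_us + H_TIMINGS_US["active_video"]

# Whole pixels that fit into the active video time at 25.174 MHz.
active_pixels = int(clock_cycles(H_TIMINGS_US["active_video"], 25.174))
```

The three blanking components sum to the quoted 6.60 µs, the whole line to 31.77 µs, and the active time holds 633 whole pixel clocks, which is the shortfall from 640 that the note describes.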

                    6.2 Generation of the horizontal synchronization signal - HSYNC

                    The HSYNC signal is High (inactive) after an external reset signal
                    (RST) is received by the VGA Controller. The signal is updated on each
                    rising edge of the external clock signal (CLK).

                    The state of the HSYNC signal depends on the value stored in the horizontal
                    counter, and is driven low when:

                    HCOUNT >= (PixTotal + BlankingLeft)

                    and remains low while:

                    HCOUNT < (PixTotal + BlankingLeft + TLSync)

                    Vertical (Frame) Period

                    6.3 Generation of the vertical synchronization signal - VSYNC

                    The VSYNC signal is High (inactive) after an external reset signal
                    (RST) is received by the VGA Controller. The signal is updated after every
                    line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

                    The state of the VSYNC signal depends on the value stored in the vertical
                    counter, and is driven low when:

                    VCOUNT >= (LinTotal + BlankingLeft)

                    and remains low while:

                    VCOUNT < (LinTotal + BlankingLeft + TFSync)
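Both sync signals follow the same window comparison, so a single helper can model either one. This is a behavioural sketch; the function name `sync_level` is hypothetical, and the example values of 24 and 95 pixel clocks are an assumption obtained by rounding the 0.94 µs front porch and 3.77 µs sync pulse at 25.175 MHz.

```python
def sync_level(count, total, blanking, sync_len):
    """Return 0 (low, active) inside the sync window, 1 (high, inactive) outside.

    count    -- HCOUNT or VCOUNT
    total    -- PixTotal or LinTotal
    blanking -- BlankingLeft, as named in the text
    sync_len -- TLSync (line) or TFSync (frame)
    """
    start = total + blanking
    return 0 if start <= count < start + sync_len else 1


# Assumed example: 640 visible pixels, 24 clocks of blanking, 95 clocks of sync.
hsync_at_start = sync_level(664, 640, 24, 95)  # first clock of the pulse
```

With these example values HSYNC is low for counts 664 through 758 and high everywhere else, matching its active-low polarity.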

                    Address Counter (MEMADDR)

                    The address counter is used to store the position of the next consecutive
                    pixel in the frame. Its value is passed to the Pixel Unit on the internal
                    bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to
                    obtain the next pixel from picture memory.

                    The counter is reset to zero when the VGA Controller receives an external
                    reset signal (RST). The size of the counter depends on the values chosen for
                    DISPSIZE_H and DISPSIZE_V, as the range is simply:

                    0 to (PixTotal x LinTotal) - 1

                    Taking the maximum number of pixels in a line and lines in a frame for each
                    of the supported display resolutions, the maximum ranges for the counter
                    are:

                    • 640x480: 0 to 307199

                    • 800x600: 0 to 479999

                    The counter has 19-bit resolution.
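The 19-bit figure follows from the largest value the counter must hold. A quick check, using a hypothetical helper name `counter_range`:

```python
def counter_range(pix_total, lin_total):
    """Return (maximum counter value, bits needed to hold it)."""
    max_value = pix_total * lin_total - 1
    return max_value, max_value.bit_length()


vga_max, vga_bits = counter_range(640, 480)
svga_max, svga_bits = counter_range(800, 600)
```

Both maxima lie between 2^18 = 262144 and 2^19 = 524288, so 18 bits are too few and 19 bits are enough for either resolution.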

                    While the value in the horizontal counter (HCOUNT) is less than the
                    total number of viewable pixels in a line (PixTotal, the integer value of
                    DISPSIZE_H), the counter is incremented on the rising edge of the external
                    clock signal (CLK). Pixel addressing within the frame is consecutive. When
                    the counter reaches the last pixel in a line, its incremented value is the
                    first pixel in the next line down.

                    The address counter will continue to be incremented until the value in the
                    vertical counter (VCOUNT) is greater than or equal to the total number of
                    viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At
                    this point, it will be rolled over to zero.

                    7 VGA Controller

                    The VGA Controller provides a simple 8-bit interface between a host
                    microcontroller and any VGA-compatible monitor. This variant of the
                    Controller provides six modes of display, depending on the resolution chosen
                    (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
                    White, 16 Colors, or 64 Colors).

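                    7.1 VGA Controller Tasks

                    1) To display the frames at a minimum 60 Hz refresh rate:

                    For 1 sec = 60 frames

                    1 frame = max 1/60 sec

                    1 pixel = (1/60 sec) / 307200 ≈ 54 ns (roughly 50 ns)

                    Max time for 1 pixel = 40 ns (a tighter budget, which leaves time for the
                    blanking intervals in each line and frame)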

                    2) To send information for each and every pixel of a frame by means of
                    RGB signals.

                    3) After completely displaying one row of information, the column must be
                    reset and the row incremented; this is done by the Hsync signal.

                    4) After completing one frame of information, a new frame has to be started.
                    For this, the row and column are reset; this is done by the Vsync signal.

                    => For giving information for 1 pixel we use 1 clock (for controlling all
                    operations):

                    1 clock = 1 pixel

                    1 clock cycle period = 40 ns

                    Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                    So, if we use a 25 MHz clock and display 1 pixel of information on each and
                    every clock, then we can attain a minimum 60 Hz refresh rate.

                    Do not use a clock frequency of less than 25 MHz.
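The clock derivation above can be reproduced numerically. This sketch counts only the 307200 visible pixels of a 640x480 frame, so the frame rates it returns are upper bounds that ignore blanking; all function names are hypothetical.

```python
FRAME_RATE_HZ = 60
VISIBLE_PIXELS = 640 * 480  # 307200 pixels per frame


def pixel_budget_ns(frame_rate_hz=FRAME_RATE_HZ, pixels=VISIBLE_PIXELS):
    # Time available per visible pixel if the whole frame time were active video.
    return 1e9 / (frame_rate_hz * pixels)


def clock_frequency_mhz(period_ns):
    # Frequency is the reciprocal of the period; 1/ns expressed in MHz.
    return 1e3 / period_ns


def max_frame_rate(clock_hz, pixels=VISIBLE_PIXELS):
    # Upper bound on frames per second at one pixel per clock, ignoring blanking.
    return clock_hz / pixels
```

The per-pixel budget comes out at about 54.25 ns, a 40 ns period corresponds to 25 MHz, and a 15 MHz clock cannot reach 60 frames per second even before blanking is accounted for.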

                    Ex:

                                          15 MHz         25 MHz                    30 MHz
                    Time period           More           (effectively utilized     Less
                                                          frequency)
                    No. of frames/sec     Less                                     More
                    Frequency             Decreases                                Increases

                    (At 15 MHz it is unable to display 60 frames/sec)

                    Signals used for image display:

                    • R (Red)

                    • G (Green)

                    • B (Blue)

                    • HSYNC (Horizontal synchronization signal)

                    • VSYNC (Vertical synchronization signal)

                    All the above signals are used to control the image data so that it is
                    correctly displayed on the VGA display monitor.

                    7.2 Timings for front porch, sync pulse, and back porch

                                    Horizontal sync    Vertical sync
                    Front porch     0.94 µs            0.35 ms
                    Sync pulse      3.77 µs            0.063 ms
                    Back porch      1.89 µs            1.02 ms

                    8 FUNCTIONAL DESCRIPTION

                    8.1 Symbol

                    VGA Controller Symbol

                    Pin description

                    The pin description is shown in the table below.

                    Table: VGA pin description

                    Name | Type | Polarity / Bus size | Description

                    Global Control Signals

                    CLK | I | Rising | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.

                    RST | I | High | Global system reset.

                    VGA Input Settings Signals

                    RESOLUTION | I | High / Low | This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

                    CMOD | I | 2 | This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

                    DISPSIZE_H | I | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

                    DISPSIZE_V | I | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

                    Data Memory Control Signals

                    RD | O | High | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

                    DATA | I | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 x 1-bit pixels. 16 Colors mode: 2 x 4-bit pixels. 64 Colors mode: 1 x 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                    ADDR_PIXEL | O | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive - once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                    VGA Monitor Control Signals

                    HSYNC | O | Falling | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                    VSYNC | O | Falling | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                    R1 | O | High / Low | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                    G1 | O | High / Low | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                    B1 | O | High / Low | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                    9 VGA SIGNAL TIMING

                    Figure 3 summarizes the signal timing involved in sending a line of
                    pixels and a frame of lines. The actual time values differ according to the
                    resolution selected (640x480 or 800x600), the processor-defined values for
                    DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
                    (CLK).

                    9.1 VGA PIXEL UNIT

                    The Pixel Unit provides access to the pixelated image stored in external
                    picture memory, reading in data a byte at a time and formatting each pixel
                    to be displayed. For each pixel, the 6-bit RGB value required for the
                    control of the monitor's three primary color electron guns is generated, so
                    that the pixel is displayed on the screen with the correct color.

                    Accessing the stored image

                    The image to be displayed on the monitor screen is written by the host
                    microcontroller into external memory space (RAM). This memory space can
                    be located anywhere (e.g. using a block of RAM within the FPGA design, or
                    using the SRAM on the NanoBoard itself).

                    Pixel data is stored in the memory space in bytes. The number of pixels in a
                    byte depends on which color palette is being used for the image (selected by
                    the CMOD input):

                    • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

                    • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

                    • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
                      (DATA[7..6] are not used)

                    The size of memory required to store a picture is determined by the total
                    number of viewable pixels in a line (determined by DISPSIZE_H), the total
                    number of viewable lines in a frame (determined by DISPSIZE_V), and the
                    number of pixels stored in each byte in memory space:

                    Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
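As a worked example of the memory formula, the sketch below evaluates it for each palette. The mode keys and the function name `picture_memory_bytes` are hypothetical; the pixels-per-byte values are those listed for the three color modes.

```python
# Pixels packed into one byte for each color palette.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory needed for one frame in the given mode."""
    total_pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so a partially filled final byte is still counted.
    return -(-total_pixels // per_byte)
```

For a full 640x480 frame this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode, and 307200 bytes in 64 Colors mode.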

                    The address in RAM where the next pixel is stored is determined using
                    an internal signal provided by the Synchronization Unit, hvcnt, which
                    reflects the current contents of the MEMADDR register. The exact addressing
                    is described below.

                    9.2 Black & White mode

                    The picture memory address - the byte of data containing the next 8 pixels -
                    is determined by using bits 18..3 of hvcnt and right shifting the contents
                    by three:

                    ADDR_PIXEL = "00" & hvcnt[18..3]

                    16 Colors mode

                    The picture memory address - the byte of data containing the next 2 pixels -
                    is determined by using bits 18..1 of hvcnt and right shifting the contents
                    by one:

                    ADDR_PIXEL = "0" & hvcnt[18..1]

                    64 Colors mode

                    The picture memory address - the byte of data containing the next pixel - is
                    determined by using the full value of hvcnt:

                    ADDR_PIXEL = hvcnt[18..0]
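Dropping the low bits of hvcnt and padding with zeros at the top is equivalent to an integer right shift. The following sketch models the three addressing rules that way; the function name `addr_pixel` and the mode keys are hypothetical.

```python
# Right-shift amount per mode: log2 of the number of pixels stored in one byte.
ADDRESS_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for pixel number hvcnt (19-bit counter)."""
    hvcnt &= 0x7FFFF  # keep only the 19 counter bits
    return hvcnt >> ADDRESS_SHIFT[mode]
```

In Black & White mode pixels 0 to 7 all map to byte 0 and pixel 8 is the first pixel of byte 1, whereas in 64 Colors mode the address is simply the pixel number.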

                    The Pixel register (PIXREG)

                    The Pixel register is used to receive the byte of pixel data read from the

                    current address in memory space The register PIXREG is reset to zero when

                    the VGA Controller receives an external reset signal (RST)

                    23

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
enex = 0
Else
enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
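The enex condition can be written as a small function. This is a minimal Python sketch of the rule stated above, not the controller's implementation: the function signature is an assumption, and the values 800 and 525 used in the example for Line Period and Frame Period are assumed figures for a 640 x 480 display.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 0 while the counters are in a blanking interval, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    # Assumed example timing: 640 x 480 visible, 800 x 525 total.
    timing = dict(pix_total=640, lin_total=480,
                  line_period=800, frame_period=525)
    print(enex(100, 100, **timing))  # inside the visible area
    print(enex(700, 100, **timing))  # inside horizontal blanking
    print(enex(100, 500, **timing))  # inside vertical blanking
```

Because RD follows enex, memory is read only while both counters are outside their blanking intervals.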

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
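The load-and-shift behavior of PIXREG in the three modes can be modeled in a few lines. The Python sketch below is a simplified behavioral model under stated assumptions: it ignores memory read latency and blanking, treats hvcnt as a counter starting at zero, and uses hypothetical names (active_pixels, MODES) that do not appear in the controller.

```python
# Per mode: (mask of hvcnt bits that must be zero to load a new byte,
#            right shift applied on the other clocks,
#            mask selecting the active pixel,
#            right shift turning hvcnt into a byte address)
MODES = {
    "bw":        (0b111, 1, 0x01, 3),
    "16_colors": (0b001, 4, 0x0F, 1),
    "64_colors": (0b000, 0, 0x3F, 0),
}


def active_pixels(memory, mode, n_clocks):
    """Return the active pixel value seen on each of n_clocks clock edges."""
    load_mask, shift, pixel_mask, addr_shift = MODES[mode]
    pixreg = 0
    pixels = []
    for hvcnt in range(n_clocks):
        if (hvcnt & load_mask) == 0:
            pixreg = memory[hvcnt >> addr_shift]  # load the next byte
        else:
            pixreg >>= shift                      # expose the next pixel
        pixels.append(pixreg & pixel_mask)
    return pixels


if __name__ == "__main__":
    print(active_pixels(bytes([0b10110001]), "bw", 8))
    print(active_pixels(bytes([0xA3, 0x5C]), "16_colors", 4))
```

In this model the pixels of a byte appear least significant first, which follows from the active pixel always sitting in the lowest bit position or low-order nibble of the register.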

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color combinations.
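The count of 64 colors follows from 4 levels on each of 3 channels (4 x 4 x 4). The Python sketch below illustrates this; it assumes the four analog levels are evenly spaced between 0 V and 0.7 V, which the text does not state, and the name dac_level is hypothetical.

```python
from itertools import product

V_MAX = 0.7  # maximum brightness level in volts, as given in the text


def dac_level(code):
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * code / 3


# Every (R, G, B) combination of 2-bit codes.
colors = list(product(range(4), repeat=3))

if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    print(len(colors))
```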

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
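The totals in the block RAM table follow directly from the 18,432 bits per block. The Python sketch below recomputes them from the column and blocks-per-column figures; the names block_ram_totals and DEVICES are hypothetical helpers used only for this check.

```python
BITS_PER_BLOCK = 16 * 1024 + 2 * 1024  # 16K data bits + 2K parity bits

# Device: (RAM columns, RAM blocks per column), as listed in the table.
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def block_ram_totals(columns, blocks_per_column):
    """Return (total blocks, total bits, total Kbits) for one device."""
    blocks = columns * blocks_per_column
    bits = blocks * BITS_PER_BLOCK
    return blocks, bits, bits // 1024


if __name__ == "__main__":
    for device, (cols, per_col) in DEVICES.items():
        print(device, *block_ram_totals(cols, per_col))
```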


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
Funded by the US Department of Defense.
The first publicly available version was released in 1985.
In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
In 1987, standardization => IEEE 1076-1987.
An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.
The IEEE 1164 standard was developed by the IEEE.
IEEE 1164 contains definitions for a nine-valued data type, std_logic.
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
Defines, e.g., two numeric types: signed and unsigned.

                    VHDL Environment

                    Design Units


                    Segments of VHDL code that can be compiled separately and stored in a

                    library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes the entity, e.g.:

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: the value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
Out: assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
Inout: bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: an out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.
E.g.:

architecture Comparator1 of Comparator is
begin
    EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.
A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database):
1 Packages
- package declaration
- package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW.
It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components.
From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style, you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no 3rd-party models for registers => you must write the behavioral description of registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
Process keyword in VHDL.
A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;

                    11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- picture memory (1 bit per pixel, 18-bit address)
    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;    -- mode matches the sync_unit entity
            c_hs       : out integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- divide the input clock by two to obtain the pixel clock
    process(clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
        --end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

                    SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks within one line (0 to 793)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter: advances on each rising edge of h_en_s
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
        --end if;
    end process;

    -- hsync pulse: low for the first 96 clocks of each line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- Horizontal Enable (row information will be sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- Vertical Enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    -- vsync pulse (Vertical Sync, Front Porch, Back Porch)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    -- v_en delayed by one clock, used to detect the start of a frame
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
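The counter limits in the synchronization block determine the display timing. The Python sketch below derives the line rate and frame rate from those limits. It assumes that clk50m is a 50 MHz clock (suggested by its name but not stated in the code), so that the divided clock clk runs at 25 MHz; all constant names are hypothetical.

```python
CLK50M_HZ = 50_000_000            # assumed frequency of the clk50m input
PIXEL_CLOCK_HZ = CLK50M_HZ / 2    # clk toggles on each rising edge of clk50m

CLOCKS_PER_LINE = 793 + 1         # c_hs_s counts 0..793
LINES_PER_FRAME = 515 + 1         # line_count_s counts 0..515
HSYNC_CLOCKS = 95 + 1             # hsync is low while c_hs_s <= 95
H_ENABLE_CLOCKS = 778 - 143 + 1   # h_en_s is high for 143 <= c_hs_s <= 778
VSYNC_LINES = 2                   # vsync is low while line_count_s < 2

line_rate_hz = PIXEL_CLOCK_HZ / CLOCKS_PER_LINE
frame_rate_hz = line_rate_hz / LINES_PER_FRAME
hsync_width_us = HSYNC_CLOCKS / PIXEL_CLOCK_HZ * 1e6

if __name__ == "__main__":
    print(f"line rate:   {line_rate_hz / 1000:.2f} kHz")
    print(f"frame rate:  {frame_rate_hz:.2f} Hz")
    print(f"hsync width: {hsync_width_us:.2f} us")
    print(f"enabled clocks per line: {H_ENABLE_CLOCKS}")
```

Under that assumption the design produces a line rate of roughly 31.5 kHz and a frame rate of roughly 61 Hz.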


                    PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      -- Asynchronous, active-low reset
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the read address at the top of the frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read the next pixel and drive it on all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address when the last memory location is reached
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking interval: all colors off
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;
end behv;


                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

Year          Gates      Source
1987          9,000      Xilinx
1992          600,000    Naval Surface Warfare Department
Early 2000s   Millions

Market size

Year              Market size
1985              First commercial FPGA technology invented by Xilinx
1987              $14 million
~1993             >$385 million
2005              $1.9 billion
2010 (estimate)   $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the state where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                    Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

                    Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
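The logic block just described can be sketched in VHDL as follows. This is only an illustrative model, not taken from any vendor's primitive library: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT whose output
-- can be taken either directly or through a flip-flop.
entity logic_block is
  generic (
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000"; -- truth table (default: 4-input AND)
    REGISTERED : boolean := false                          -- true selects the flip-flop output
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    q   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT behaves like a 16 x 1 memory addressed by the inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- D flip-flop capturing the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  q <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

With the default LUT_INIT of x"8000", only table entry 15 is '1', so the block computes the AND of its four inputs. Any other 4-input function is obtained by changing the 16-bit table, which is essentially what the configuration bitstream does in a real device.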

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                    Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
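As a minimal illustration of the first step in that flow, the sketch below pairs a tiny RTL description with a test bench that drives it and checks the result. Both entities (and2 and and2_tb) are hypothetical examples chosen for brevity; they are not part of the VGA controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical unit under test: a 2-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: it has no ports, drives the inputs and checks the output.
entity and2_tb is
end entity and2_tb;

architecture sim of and2_tb is
  signal a, b, y : std_logic := '0';
begin
  uut : entity work.and2 port map (a => a, b => b, y => y);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert y = '0' report "0 and 0 should be 0" severity error;
    a <= '0'; b <= '1'; wait for 10 ns;
    assert y = '0' report "0 and 1 should be 0" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert y = '0' report "1 and 0 should be 0" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert y = '1' report "1 and 1 should be 1" severity error;
    report "and2_tb finished" severity note;
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused for the later gate-level and back-annotated simulations by binding uut to the post-synthesis or post-layout netlist instead of the RTL architecture.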

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

                    Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method to do this.


Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is in contrast to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.


13 SIMULATION RESULTS

[Figure: gate-level view of the VGA controller]

[Figure: gate-level view of the sync unit]

[Figure: technology schematic]


                    14 APPLICATIONS

                    Motion Tracking Systems

                    Image processing systems

                    Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space, and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.





6.1 The vertical timings are:

Parameter                Value      Unit
Vertical lines           480
Vertical sync polarity   Negative
Vertical frequency       59.94      Hz
Front porch (E)          0.35       ms
Sync pulse length (F)    0.06       ms
Back porch (G)           1.02       ms
Active video (H)         15.25      ms

(Total vertical sync time: 1.43 ms)

640 x 400 @ 70 Hz is the video mode used for booting most x86 personal computers.

640 x 480 @ 60 Hz is the default MS-Windows graphics mode with 16 colors.

It should be noted that the actual timings vary slightly. For example, for 640x480 @ 60 fps, a 25.17 µs active video time with a pixel frequency of 25.174 MHz gives 633 pixels rather than the expected 640 pixels.

6.2 Generation of the horizontal synchronization signal - HSYNC

The HSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated on each rising edge of the external clock signal (CLK).

The state of the HSYNC signal depends on the value stored in the horizontal counter and is driven low when:

HCOUNT >= (PixTotal + BlankingLeft)

and remains low while:

HCOUNT < (PixTotal + BlankingLeft + TLSync)
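The two conditions above can be sketched as a small clocked process. This is a hedged illustration rather than the controller's actual source: the entity name hsync_gen, the active-high reset polarity, and the default values (640 visible pixels, a 16-clock gap before the pulse, a 96-clock pulse, and 800 clocks per line, which are the figures commonly quoted for 640x480 at 60 Hz) are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity hsync_gen is
  generic (
    PIX_TOTAL     : natural := 640;  -- PixTotal: visible pixels per line
    BLANKING_LEFT : natural := 16;   -- BlankingLeft: assumed gap before the sync pulse
    TL_SYNC       : natural := 96;   -- TLSync: assumed sync pulse length in pixel clocks
    LINE_LENGTH   : natural := 800   -- assumed total pixel clocks per line
  );
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;           -- assumed active-high reset
    hsync : out std_logic
  );
end entity hsync_gen;

architecture rtl of hsync_gen is
  signal hcount : natural range 0 to LINE_LENGTH - 1 := 0;
begin
  process (clk, rst)
  begin
    if rst = '1' then
      hcount <= 0;
      hsync  <= '1';                 -- high (inactive) after reset
    elsif rising_edge(clk) then
      -- Free-running horizontal counter, one count per pixel clock.
      if hcount = LINE_LENGTH - 1 then
        hcount <= 0;
      else
        hcount <= hcount + 1;
      end if;

      -- Drive HSYNC low only inside the sync window given in the text.
      if hcount >= PIX_TOTAL + BLANKING_LEFT and
         hcount <  PIX_TOTAL + BLANKING_LEFT + TL_SYNC then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;
end architecture rtl;
```

Because hsync is assigned inside the clocked process, it is registered and changes one clock after hcount enters or leaves the window. A design that needs the pulse aligned exactly with the counter would compare against values one count earlier.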

                      Vertical (Frame) Period

6.3 Generation of the vertical synchronization signal - VSYNC

The VSYNC signal is High (inactive) after an external reset signal (RST) is received by the VGA Controller. The signal is updated after every line of pixels is completed (i.e. on each rising edge of the HSYNC signal).

The state of the VSYNC signal depends on the value stored in the vertical counter and is driven low when:

VCOUNT >= (LinTotal + BlankingLeft)

and remains low while:

VCOUNT < (LinTotal + BlankingLeft + TFSync)

                      Address Counter (MEMADDR)

The address counter is used to store the position of the next consecutive pixel in the frame. Its value is passed to the Pixel Unit on the internal bus signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain the next pixel from picture memory.

The counter is reset to zero when the VGA Controller receives an external reset signal (RST). The size of the counter depends on the values chosen for DISPSIZE_H and DISPSIZE_V, as the range is simply

0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199

• 800x600: 0 to 479999

The counter has 19-bit resolution.
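The 19-bit figure can be checked with a short Python calculation. The helper name `counter_range` is hypothetical; it simply applies the range formula above and asks how many bits the largest address needs.

```python
from typing import Tuple


def counter_range(pix_total: int, lin_total: int) -> Tuple[int, int]:
    """Return (last address, bits needed) for a PixTotal x LinTotal frame."""
    last_address = pix_total * lin_total - 1
    return last_address, last_address.bit_length()


for width, height in ((640, 480), (800, 600)):
    last, bits = counter_range(width, height)
    print(f"{width}x{height}: 0 to {last}, {bits} bits")
```

Both supported resolutions need 19 bits, since 2^18 = 262144 is too small for either range and 2^19 = 524288 covers both.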

While the value in the horizontal counter (HCOUNT) is less than the total number of viewable pixels in a line (PixTotal, the integer value of DISPSIZE_H), the counter is incremented on the rising edge of the external clock signal (CLK). Pixel addressing within the frame is consecutive. When the counter reaches the last pixel in a line, its incremented value is the first pixel in the next line down.

The address counter will continue to be incremented until the value in the vertical counter (VCOUNT) is greater than or equal to the total number of viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this point it will be rolled over to zero.
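The increment and roll-over rules can be modeled in a few lines of Python. This is a behavioral sketch, not the Controller's implementation: the function names and the tiny 4x3 frame with a 6-clock line period and 5-line frame period are assumptions made to keep the trace short.

```python
def step_address(memaddr: int, hcount: int, vcount: int,
                 pix_total: int, lin_total: int) -> int:
    """Next MEMADDR value after one clock edge, following the rules above."""
    if vcount >= lin_total:      # past the last viewable line: roll over
        return 0
    if hcount < pix_total:       # inside the viewable part of the line
        return memaddr + 1
    return memaddr               # horizontal blanking: hold the address


def scan_frame(pix_total: int, lin_total: int, line_period: int, frame_period: int):
    """Walk one full frame and collect the address seen at each viewable pixel."""
    memaddr, seen = 0, []
    for vcount in range(frame_period):
        for hcount in range(line_period):
            if hcount < pix_total and vcount < lin_total:
                seen.append(memaddr)
            memaddr = step_address(memaddr, hcount, vcount, pix_total, lin_total)
    return seen, memaddr


addresses, final = scan_frame(pix_total=4, lin_total=3, line_period=6, frame_period=5)
print(addresses)  # [0, 1, 2, ..., 11]: consecutive across line boundaries
print(final)      # 0: rolled over during vertical blanking
```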

7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host microcontroller and any VGA-compatible monitor. This variant of the Controller provides six modes of display, depending on the resolution chosen (640x480 (VGA) or 800x600 (SVGA)) and the color palette (Black & White, 16 Colors or 64 Colors).

7.1 VGA Controller Tasks

1) To display frames at a refresh rate of at least 60 Hz:

   1 s = 60 frames

   1 frame = at most 1/60 s

   1 pixel = (1/60 s) / 307200 ≈ 54 ns

   Allowing for the blanking intervals in each line and frame, the maximum time for 1 pixel is taken as 40 ns.

2) To send the information for each and every pixel of a frame by means of the RGB signals.

3) After one row has been completely displayed, the column position must be reset and the row incremented; this is done by the HSYNC signal.

4) After one frame has been completed, a new frame has to be started; for this, both the row and the column are reset, which is done by the VSYNC signal.

One clock cycle is used to deliver the information for one pixel (the same clock controls all operations):

1 clock = 1 pixel

1 clock cycle period = 40 ns

Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if a 25 MHz clock is used and one pixel is displayed on each and every clock cycle, the minimum 60 Hz refresh rate can be attained.

Do not use a clock frequency lower than 25 MHz.

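Example:

Clock frequency      Time period      No. of frames/sec
15 MHz               more             less (unable to display 60 frames/sec)
25 MHz               (effectively utilized frequency)
30 MHz               less             more

As the clock frequency increases, the time period decreases and the number of frames per second increases.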
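The arithmetic behind the 25 MHz requirement can be reproduced with a short Python sketch. The function names are hypothetical. The figure of 800 clocks per line by 525 lines per frame is an assumption taken from the commonly used 640x480 timing (visible area plus blanking); it is not stated in this document.

```python
def pixel_time_ns(pixels_per_frame: int, refresh_hz: float) -> float:
    """Time available per pixel, in ns, if every clock carried a visible pixel."""
    return 1e9 / (pixels_per_frame * refresh_hz)


def clock_mhz(period_ns: float) -> float:
    """Clock frequency in MHz for a given clock period in ns."""
    return 1e3 / period_ns


def frames_per_second(clock_hz: float, clocks_per_frame: int) -> float:
    """Refresh rate reached when one clock is spent per pixel position."""
    return clock_hz / clocks_per_frame


# Assumption: 800 clocks per line x 525 lines per frame, including blanking.
CLOCKS_PER_FRAME = 800 * 525

print(round(pixel_time_ns(640 * 480, 60), 2))   # 54.25 ns without blanking
print(clock_mhz(40.0))                          # 25.0 MHz for a 40 ns period
for mhz in (15, 25, 30):
    print(mhz, "MHz ->", round(frames_per_second(mhz * 1e6, CLOCKS_PER_FRAME), 1), "frames/s")
```

Under that assumption a 15 MHz clock reaches only about 36 frames per second, while 25 MHz gives roughly 60 and 30 MHz roughly 71, which matches the qualitative comparison above.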

Signals used for image display:

R (Red)

G (Green)

B (Blue)

HSYNC (horizontal synchronization signal)

VSYNC (vertical synchronization signal)

All of the above signals are used to control the image data so that it is correctly displayed on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

                 Horizontal sync      Vertical sync
Front porch      0.94 µs              0.35 ms
Sync pulse       3.77 µs              0.063 ms
Back porch       1.89 µs              1.02 ms

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

VGA Controller Symbol

Pin description

The pin description is shown in the table below.

Table: VGA Pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                      9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
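As a worked example, the memory formula can be evaluated for each palette in Python. The dictionary keys and the function name `picture_memory_bytes` are hypothetical labels for this sketch; the pixels-per-byte figures are the ones given for the three color modes.

```python
# Pixels packed into one byte of picture memory, per color palette.
PIXELS_PER_BYTE = {"bw": 8, "16colors": 2, "64colors": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """Bytes needed to hold one PixTotal x LinTotal picture in the given mode."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)   # ceiling division, in case of a partial last byte


for mode in PIXELS_PER_BYTE:
    print(mode, picture_memory_bytes(640, 480, mode), picture_memory_bytes(800, 600, mode))
```

For a full 640x480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.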

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = 00 & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = 0 & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
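The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. A minimal Python sketch, with hypothetical mode labels and function name, makes the mapping from pixel position to byte address concrete.

```python
# Right shift applied to hvcnt in each color mode (8, 2 or 1 pixels per byte).
ADDRESS_SHIFT = {"bw": 3, "16colors": 1, "64colors": 0}


def addr_pixel(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds pixel number hvcnt."""
    return hvcnt >> ADDRESS_SHIFT[mode]


# Pixel 17 lives in byte 2 (B&W), byte 8 (16 Colors) or byte 17 (64 Colors).
print([addr_pixel(17, mode) for mode in ("bw", "16colors", "64colors")])  # [2, 8, 17]
```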

                      The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then

enex = 0

Else

enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
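The enex condition can be transcribed directly into Python. The function name and argument names are hypothetical, and the line period of 800 clocks and frame period of 525 lines used in the usage lines are assumed example values, not figures from this document.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Line display enable: 0 during horizontal or vertical blanking, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed example timing: 640x480 visible, 800-clock lines, 525-line frames.
args = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
print(enex(0, 0, **args))     # 1: inside the visible area, RD may be active
print(enex(642, 0, **args))   # 0: horizontal blanking, RD = 0
print(enex(0, 482, **args))   # 0: vertical blanking, RD = 0
```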

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
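The order in which pixels leave the Pixel register follows from the shift rules above: the active pixel is always taken from the low end of the byte. The Python sketch below lists the pixels of one byte in display order; the function name and mode labels are hypothetical, and the register is modeled as a plain integer rather than clocked hardware.

```python
def unpack_pixels(byte: int, mode: str) -> list:
    """Pixels of one memory byte, in the order they reach the active position."""
    if mode == "bw":            # 8 pixels, shifted right one bit per clock
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16colors":      # 2 pixels, low nibble first, then shift right by 4
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64colors":      # 1 pixel in DATA[5..0]; bits 7 and 6 are unused
        return [byte & 0x3F]
    raise ValueError(f"unknown mode: {mode}")


print(unpack_pixels(0b10110001, "bw"))   # [1, 0, 0, 0, 1, 1, 0, 1]
print(unpack_pixels(0xA5, "16colors"))   # [5, 10]
print(unpack_pixels(0xFF, "64colors"))   # [63]
```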

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
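The count of 64 colors follows from three channels with four levels each. The Python sketch below enumerates them and maps each 2-bit code to a voltage. The evenly spaced levels are an assumption made for illustration; the text only states that the range runs from 0 V to 0.7 V, and the actual spacing depends on the DAC resistor network.

```python
from itertools import product

V_MAX = 0.7  # volts at maximum brightness, as stated in the text


def dac_level(code: int) -> float:
    """Analog level for a 2-bit code, assuming four evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("a 2-bit code must be in the range 0..3")
    return V_MAX * (code / 3)


# Every (R, G, B) combination of 2-bit codes is one displayable color.
colors = list(product(range(4), repeat=3))

print(len(colors))                                    # 64
print([round(dac_level(c), 3) for c in range(4)])     # [0.0, 0.233, 0.467, 0.7]
```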

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
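To relate these figures to the VGA picture memory, the Python sketch below estimates how many block RAMs a Black & White frame buffer would occupy and which devices from Table 2 could hold it. It counts only the 16K data bits per block and ignores the parity bits and any other use of block RAM in the design, so it is a rough sizing aid under those assumptions; the function and variable names are hypothetical.

```python
DATA_BITS_PER_BLOCK = 16 * 1024   # 16K data bits per block RAM (parity bits ignored)

# Total RAM blocks per device, from Table 2.
TOTAL_BLOCKS = {
    "XC3S100E": 4, "XC3S250E": 12, "XC3S500E": 20,
    "XC3S1200E": 28, "XC3S1600E": 36,
}


def blocks_needed(pix_total: int, lin_total: int, bits_per_pixel: int) -> int:
    """Block RAMs needed to store one picture, rounded up to whole blocks."""
    total_bits = pix_total * lin_total * bits_per_pixel
    return -(-total_bits // DATA_BITS_PER_BLOCK)   # ceiling division


def devices_that_fit(blocks: int) -> list:
    """Devices from Table 2 with at least the requested number of blocks."""
    return [name for name, total in TOTAL_BLOCKS.items() if total >= blocks]


needed = blocks_needed(640, 480, bits_per_pixel=1)   # Black & White, 1 bit per pixel
print(needed)                     # 19
print(devices_that_fit(needed))   # ['XC3S500E', 'XC3S1200E', 'XC3S1600E']
```

A 640x480 picture in 16 Colors mode would need 75 blocks by the same calculation, which exceeds every device in the table, so that mode would rely on external memory.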


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

                      Why (V) HDL

                      Interoperability

                      Technology independence

                      Design reuse

                      Several levels of abstraction

                      Readability

                      Standard language

                      Widely supported

                      What is VHDL

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                      Design specification language

                      Design entry language

                      Design simulation language

                      Design documentation language

                      An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines e.g. two numeric types: signed and unsigned.

                      VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

                      Entities

                      A black box with interface definition

Defines the inputs/outputs of a component (define pins)

                      A way to represent modularity in VHDL

                      Similar to symbol in schematic

                      Entity declaration describes entity

E.g.

-- 8-bit equality comparator: interface only
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together.

Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have e.g. a gate-level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

Consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

Collection of VHDL design units (database)

1 Packages
  package declaration
  package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.

                      Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (towards a higher level of abstraction)

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

                      Circuit is described in terms of how data moves through the system

                      In the dataflow style you describe how information flows between registers in

                      the system

                      The combinational logic is described at a relatively high level the placement

                      and operation of registers is specified quite precisely

                      The behavior of the system over the time is defined by registers

                      There are no build-in registers in VHDL-language

                      ndashEither lower level description

                      ndashor behavioral description of sequential elements is needed

                      The lower level register descriptions must be created or obtained

                      If there is no 3rd party models for registers =gt you must write the behavioral

                      description of registers

                      The behavioral description can be provided in the form of

                      subprograms(functions or procedures)
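The points above can be sketched as follows: the combinational logic is a concurrent (dataflow) assignment, and the register is described behaviorally by a procedure that is called concurrently. The entity and procedure names are hypothetical, and this is a simulation-oriented sketch; whether a synthesis tool accepts a register written as a procedure depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dataflow_reg_demo is
  port (
    clk : in  std_logic;
    a   : in  std_logic;
    b   : in  std_logic;
    sel : in  std_logic;
    q   : out std_logic
  );
end entity dataflow_reg_demo;

architecture dataflow of dataflow_reg_demo is
  signal d : std_logic;

  -- Behavioral description of a D register, written as a procedure.
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic) is
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end procedure dff;
begin
  -- Combinational logic between registers: a 2-to-1 multiplexer.
  d <= a when sel = '1' else b;

  -- Concurrent procedure call: re-evaluated whenever clk or d changes.
  dff(clk, d, q);
end architecture dataflow;
```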

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
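A small sketch of both cases: one output uses an explicit delay, which is meaningful in simulation only, and the other is computed by a process in which just the order of the sequential statements matters. The entity and signal names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity behavioral_demo is
  port (
    a, b      : in  std_logic;
    y_delayed : out std_logic;
    y_ordered : out std_logic
  );
end entity behavioral_demo;

architecture behavioral of behavioral_demo is
begin
  -- Time expressed precisely: the result appears 10 ns after a change.
  -- Synthesis tools ignore the "after" clause.
  y_delayed <= a and b after 10 ns;

  -- No delay: only the order of the sequential statements is defined.
  process (a, b)
    variable tmp : std_logic;
  begin
    tmp := a xor b;     -- step 1
    tmp := not tmp;     -- step 2 uses the result of step 1
    y_ordered <= tmp;   -- overall: a xnor b
  end process;
end architecture behavioral;
```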

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

A VHDL process is quite similar to a Unix process.

Process keyword in VHDL

The process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT
both.

The sensitivity list or the wait statement(s) name the signals that wake the
process up.

General format:

    process [(sensitivity_list)]
        process_declarative_part
    begin
        process_statements
        [wait_statement]
    end process;
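The two permitted forms can be compared side by side. In this sketch (entity and label names are hypothetical) both processes behave identically: a sensitivity list is equivalent to a single "wait on" statement with the same signals placed at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_forms is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity process_forms;

architecture behavioral of process_forms is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;       -- suspend until a or b changes
  end process with_wait;
end architecture behavioral;
```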


                      11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- numeric_std declares the same type names as STD_LOGIC_ARITH; it is only
-- needed by the commented-out to_unsigned conversion in the architecture.
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;   -- input clock
    rstn   : in  std_logic;   -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: 18-bit address, 1-bit data
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                      SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs <= c_hs_s;
  line_count <= line_count_s;
  h_en <= h_en_s;
  v_en <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  -- and returns to 0 after reaching 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse: low while the horizontal counter is in 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse (vertical sync, front porch, back porch):
  -- low while the line counter is 0 or 1
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- v_en delayed by one clock; pixelgen compares it with v_en to detect
  -- the start of the vertical enable region
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
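The counter limits in the synchronization block fix the line and frame rates. The worked arithmetic below is a sketch under one assumption that the code itself does not state: that the clk50m input is a 50 MHz clock, as its name suggests, so that the divide-by-two in the top level yields a 25 MHz pixel clock.

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz}
  && \text{(assumed input clock of 50 MHz)} \\
N_{\text{line}}  &= 793 + 1 = 794\ \text{clocks per line}
  && \text{(c\_hs\_s counts } 0 \ldots 793\text{)} \\
f_{\text{line}}  &= \frac{25\,000\,000}{794} \approx 31.49\ \text{kHz} \\
N_{\text{frame}} &= 515 + 1 = 516\ \text{lines per frame}
  && \text{(line\_count\_s counts } 0 \ldots 515\text{)} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s}
  && \text{(hsync low for counts } 0 \ldots 95\text{)}
\end{align*}
```

Under the same assumption, the horizontal enable is active for 778 - 143 + 1 = 636 pixel clocks per line, and the vertical sync pulse lasts two lines.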


                      PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- start of the vertical enable region: restart the memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- picture area: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address at the last used memory location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- outside the enabled region: output black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;


                      12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 (estimate): $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has bolstered development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company attributes the growing number of FPGA design starts to
them (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. The
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the state where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA considered here is an array or island-style FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height of
one row or the width of one column, as shown below. All the routing channels
have the same width (number of wires).

                      121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
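The logic block described above can be modeled in a few lines of VHDL. This is a behavioral sketch, not a vendor primitive: the entity name and both generics are hypothetical, and the choice between the registered and unregistered output, which a real FPGA makes with a configuration bit, is modeled here as a generic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_block is
  generic (
    -- Truth table of the LUT; bit n is the output for input value n.
    -- x"8000" sets only bit 15, i.e. a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- Selects the registered (true) or unregistered (false) output.
    REGISTERED : boolean := true
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- 4-input LUT: the inputs form an address into the truth table.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- Flip-flop on the LUT output.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT result.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the implemented function without changing the structure, which is the property that makes the block programmable; for example, x"6996" would give a 4-input XOR (odd parity).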

For this example architecture, the locations of the FPGA logic block pins are
shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

                      Generally, the FPGA routing is unsegmented. That is, each wiring segment
                      spans only one logic block before it terminates in a switch box. By turning on
                      some of the programmable switches within a switch box, longer paths can be
                      constructed. For higher-speed interconnect, some FPGA architectures use
                      longer routing lines that span multiple logic blocks.

                      Whenever a vertical and a horizontal channel intersect, there is a switch box. In
                      this architecture, when a wire enters a switch box, there are three
                      programmable switches that allow it to connect to three other wires in adjacent
                      channel segments. The pattern, or topology, of switches used in this
                      architecture is the planar or domain-based switch box topology. In this switch
                      box topology, a wire in track number one connects only to wires in track
                      number one in adjacent channel segments, wires in track number 2 connect
                      only to other wires in track number 2, and so on. The figure below illustrates
                      the connections in a switch box.

                      Switch box topology
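                      The planar (domain-based) rule can also be written out as a small enumeration.
                      The following is a sketch under assumptions: the side names and the
                      (side, track) representation are invented for illustration and do not come from
                      any FPGA tool.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width):
    """Return every programmable switch in a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints. In the planar
    topology a switch only ever joins wires that share a track number.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def reachable(switches, side, track):
    """List the wires that the wire entering at (side, track) can connect to."""
    src = (side, track)
    return sorted(next(iter(sw - {src})) for sw in switches if src in sw)


box = planar_switch_box(4)
neighbours = reachable(box, "west", 2)
```

                      For a wire entering on any side, reachable() returns exactly three wires, all on
                      the same track, which matches the three programmable switches described above.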

                      Modern FPGA families expand upon the above capabilities to include higher-
                      level functionality fixed into the silicon. Having these common functions
                      embedded into the silicon reduces the area required and gives those functions
                      increased speed compared to building them from primitives. Examples of these
                      include multipliers, generic DSP blocks, embedded processors, high-speed I/O
                      logic, and embedded memories.

                      FPGAs are also widely used for systems validation, including pre-silicon
                      validation, post-silicon validation, and firmware development. This allows chip
                      companies to validate their design before the chip is produced in the factory,
                      reducing the time to market.

                      12.4 FPGA Design and Programming

                      To define the behavior of the FPGA, the user provides a design in a hardware
                      description language (HDL) or as a schematic. The HDL form might be easier to
                      work with when handling large structures, because it's possible to just specify
                      them numerically rather than having to draw every piece by hand. On the other
                      hand, schematic entry can allow for easier visualisation of a design.

                      Then, using an electronic design automation tool, a technology-mapped netlist
                      is generated. The netlist can then be fitted to the actual FPGA architecture
                      using a process called place-and-route, usually performed by the FPGA
                      company's proprietary place-and-route software. The user will validate the
                      map, place, and route results via timing analysis, simulation, and other
                      verification methodologies. Once the design and validation process is
                      complete, the binary file generated (also using the FPGA company's
                      proprietary software) is used to (re)configure the FPGA.

                      Going from schematic/HDL source files to actual configuration: the source
                      files are fed to a software suite from the FPGA/CPLD vendor that, through
                      different steps, will produce a file. This file is then transferred to the
                      FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                      like an EEPROM.

                      The most common HDLs are VHDL and Verilog. HDLs have been compared to
                      assembly languages, and in an attempt to reduce the complexity of designing in
                      them there are moves to raise the abstraction level through the introduction of
                      alternative languages.

                      To simplify the design of complex systems in FPGAs, there exist libraries of
                      predefined complex functions and circuits that have been tested and optimized
                      to speed up the design process. These predefined circuits are commonly called
                      IP cores and are available from FPGA vendors and third-party IP suppliers
                      (rarely free, and typically released under proprietary licenses). Other
                      predefined circuits are available from developer communities such as
                      OpenCores (typically free and released under the GPL, BSD, or a similar
                      license) and other sources.

                      In a typical design flow, an FPGA application developer will simulate the
                      design at multiple stages throughout the design process. Initially, the RTL
                      description in VHDL or Verilog is simulated by creating test benches to
                      simulate the system and observe results. Then, after the synthesis engine has
                      mapped the design to a netlist, the netlist is translated to a gate-level
                      description, where simulation is repeated to confirm the synthesis proceeded
                      without errors. Finally, the design is laid out in the FPGA, at which point
                      propagation delays can be added and the simulation run again with these
                      values back-annotated onto the netlist.

                      Basic Process Technology Types

                      SRAM - Based on static memory technology. In-system programmable and
                      re-programmable. Requires external boot devices. CMOS.

                      Antifuse - One-time programmable. CMOS.

                      EPROM - Erasable Programmable Read-Only Memory technology. Usually
                      one-time programmable in production because of plastic packaging.
                      Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                      EEPROM - Electrically Erasable Programmable Read-Only Memory
                      technology. Can be erased, even in plastic packages. Some, but not all,
                      EEPROM devices can be in-system programmed. CMOS.

                      Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                      packages. Some, but not all, flash devices can be in-system programmed.
                      Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                      therefore less expensive to manufacture. CMOS.

                      Fuse - One-time programmable. Bipolar.

                      Major Manufacturers

                      Xilinx and Altera are the current FPGA market leaders and long-time industry
                      rivals. Together, they control over 80 percent of the market, with Xilinx alone
                      representing over 50 percent.

                      Xilinx also provides free Windows and Linux design software, while Altera
                      provides free Windows tools; the Solaris and Linux tools are only available via
                      a rental scheme.

                      Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                      (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                      Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-
                      focused CSSP, no general-purpose FPGAs).

                      12.5 FPGA prototype

                      FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                      prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
                      hardware verification and early software development.

                      Verification of hardware designs alongside early software and firmware
                      co-design has become mainstream. Prototyping SoC and ASIC designs on an
                      FPGA has become a good method to do this.

                      Reasons why prototyping is important

                      1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
                      it is functionally correct. This is compared to designers relying only on
                      software simulations to verify that their hardware design is sound. Simulation
                      speed and modeling accuracy limitations hinder this development.

                      2. Due to time constraints, many projects cannot wait until the silicon is back
                      from the foundry to start on software tests. FPGA prototyping allows for much
                      more time in the area of software development and testing at the
                      software-hardware integration stage. This exposes many unforeseen software
                      bugs that appear due to today's array of operating systems, applications, and
                      hardware.

                      3. Prototyping also allows the developer to ensure that all IP technologies on
                      the system work well together, off the simulation stage and in actual form.

                      4. Prototyping has the added advantage of serving as a demo platform for SoC
                      clients, bringing in interest early. This speeds up the overall development
                      cycle and allows for more enhancement or improvement of the chip features
                      than would otherwise have been possible.

                      13 SIMULATION RESULTS

                      GATE LEVEL

                      Fig: VGA CONTROLLER

                      GATE LEVEL

                      Fig: SYNC UNIT

                      TECHNOLOGY SCHEMATIC

                      14 APPLICATIONS

                      Motion Tracking Systems

                      Image processing systems

                      Display systems

                      14.1 CONCLUSION

                      Hence, by implementing the VGA Controller on an FPGA, a processor-generated
                      picture (pixelated) is taken from memory space, and the Controller provides
                      digital RGB values for each pixel, as well as horizontal and vertical
                      synchronization signals, in order to correctly display the picture on a
                      connected monitor.

                        The HSYNC signal is High (inactive) after an external reset signal
                        (RST) is received by the VGA Controller. The signal is updated on each rising
                        edge of the external clock signal (CLK).

                        The state of the HSYNC signal depends on the value stored in the horizontal
                        counter and is driven low when:

                        HCOUNT ≥ (PixTotal + BlankingLeft)

                        and remains low while:

                        HCOUNT < (PixTotal + BlankingLeft + TLSync)

                        Vertical (Frame) Period

                        6.3 Generation of the vertical synchronization signal - VSYNC

                        The VSYNC signal is High (inactive) after an external reset signal
                        (RST) is received by the VGA Controller. The signal is updated after every
                        line of pixels is completed (i.e., on each rising edge of the HSYNC signal).

                        The state of the VSYNC signal depends on the value stored in the vertical
                        counter and is driven low when:

                        VCOUNT ≥ (LinTotal + BlankingLeft)

                        and remains low while:

                        VCOUNT < (LinTotal + BlankingLeft + TFSync)
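                        Both sync signals follow the same comparison against a counter, so one
                        helper can express the rule. This is a minimal sketch, not the Controller's
                        implementation: the function name and the example numbers (640 visible
                        pixels, a blanking offset of 16, and a sync width of 96) are illustrative
                        assumptions, not values given in this document.

```python
def sync_level(count, visible, blanking, sync_width):
    """Return the level of an active-low sync signal for a counter value.

    The signal is Low (0) while
        visible + blanking <= count < visible + blanking + sync_width
    and High (1, inactive) otherwise. For HSYNC the arguments correspond to
    HCOUNT, PixTotal, BlankingLeft and TLSync; for VSYNC to VCOUNT, LinTotal,
    the blanking offset and TFSync.
    """
    start = visible + blanking
    return 0 if start <= count < start + sync_width else 1


# Illustrative (assumed) horizontal parameters.
PIX_TOTAL, BLANKING_LEFT, TL_SYNC = 640, 16, 96
hsync_trace = [sync_level(h, PIX_TOTAL, BLANKING_LEFT, TL_SYNC) for h in range(800)]
```

                        In this trace the signal is Low for exactly TL_SYNC consecutive counter
                        values and High everywhere else.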

                        Address Counter (MEMADDR)

                        The address counter is used to store the position of the next consecutive
                        pixel in the frame. Its value is passed to the Pixel Unit on the internal bus
                        signal hvcnt, which is then used to provide the ADDR_PIXEL signal to obtain
                        the next pixel from picture memory.

                        The counter is reset to zero when the VGA Controller receives an external
                        reset signal (RST). The size of the counter depends on the values chosen for
                        DISPSIZE_H and DISPSIZE_V, as the range is simply:

                        0 to (PixTotal x LinTotal) - 1

                        Taking the maximum number of pixels in a line and lines in a frame for each
                        of the supported display resolutions, the maximum ranges for the counter are:

                        • 640x480: 0 to 307199

                        • 800x600: 0 to 479999

                        The counter has 19-bit resolution.
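                        The 19-bit figure follows directly from the largest address in each
                        resolution. A short check, where the function name counter_range is chosen
                        only for this example:

```python
def counter_range(pix_total, lin_total):
    """Return (last_address, bits_needed) for the address counter."""
    last_address = pix_total * lin_total - 1
    return last_address, last_address.bit_length()


vga = counter_range(640, 480)
svga = counter_range(800, 600)
```

                        Both resolutions need 19 bits, because each last address lies between
                        2^18 = 262144 and 2^19 = 524288.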

                        While the value in the horizontal counter (HCOUNT) is less than the
                        total number of viewable pixels in a line (PixTotal, the integer value of
                        DISPSIZE_H), the counter is incremented on the rising edge of the external
                        clock signal (CLK). Pixel addressing within the frame is consecutive. When
                        the counter reaches the last pixel in a line, its incremented value is the
                        first pixel in the next line down.

                        The address counter will continue to be incremented until the value in the
                        vertical counter (VCOUNT) is greater than or equal to the total number of
                        viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At
                        this point, it will be rolled over to zero.

                        7 VGA Controller

                        The VGA Controller provides a simple 8-bit interface between a host
                        microcontroller and any VGA-compatible monitor. This variant of the
                        Controller provides six modes of display, depending on the resolution chosen
                        (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
                        White, 16 Colors, or 64 Colors).

                        7.1 VGA Controller Tasks

                        1) To display the frames at a minimum 60 Hz refresh rate:

                           1 sec = 60 frames

                           1 frame = max 1/60 sec

                           1 pixel = (1/60 sec) / 307200 ≈ 54 ns (counting only the 640 x 480
                           visible pixels)

                           Max time for 1 pixel = 40 ns (once the blanking intervals in each line
                           and frame are also budgeted for)

                        2) To send information for each and every pixel of a frame by means of
                        RGB signals.

                        3) After one row of information has been completely displayed, the column
                        must be reset and the row incremented; this is done by the Hsync signal.

                        4) After one frame of information has been completed, a new frame has to be
                        started; for this, the row and column are reset, which is done by the Vsync
                        signal.

                        => To give the information for 1 pixel we use 1 clock (for controlling all
                        operations):

                           1 clock = 1 pixel

                           1 clock cycle period = 40 ns

                           Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                        So if we use a 25 MHz clock and display 1 pixel of information on each and
                        every clock, we can attain a minimum 60 Hz refresh rate. Do not use a clock
                        frequency lower than 25 MHz.

                        Ex:

                        Clock frequency    Time period    No. of frames/sec
                        15 MHz             more           less (unable to display 60 frames/sec)
                        25 MHz             effectively utilized frequency
                        30 MHz             less           more

                        As the clock frequency increases, the time period decreases and the number
                        of frames/sec increases.
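                        The arithmetic behind the 40 ns figure can be checked numerically. The
                        sketch below assumes the commonly quoted 640x480 frame of 800 clocks per
                        line and 525 lines per frame including blanking; that total is an assumption
                        brought in for illustration and is not stated in this document.

```python
VISIBLE_PIXELS = 640 * 480   # 307200 viewable pixels per frame
TOTAL_PIXELS = 800 * 525     # assumed total per frame, including blanking


def pixel_period_ns(pixels_per_frame, refresh_hz):
    """Time available per pixel clock for a given refresh rate."""
    return 1e9 / (refresh_hz * pixels_per_frame)


def refresh_rate_hz(clock_hz, pixels_per_frame=TOTAL_PIXELS):
    """Refresh rate obtained when one pixel slot is clocked per cycle."""
    return clock_hz / pixels_per_frame


visible_only_ns = pixel_period_ns(VISIBLE_PIXELS, 60)   # about 54.3 ns
with_blanking_ns = pixel_period_ns(TOTAL_PIXELS, 60)    # about 39.7 ns
rates = {mhz: refresh_rate_hz(mhz * 1e6) for mhz in (15, 25, 30)}
```

                        Under this assumption, 15 MHz gives roughly 36 frames/sec, 25 MHz roughly
                        60 frames/sec, and 30 MHz roughly 71 frames/sec, which is why a clock slower
                        than 25 MHz cannot sustain a 60 Hz refresh.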

                        Signals used for image display:

                        R (Red)

                        G (Green)

                        B (Blue)

                        HSYNC (Horizontal Synchronization signal)

                        VSYNC (Vertical Synchronization signal)

                        All the above signals are used to control the image data so that it is
                        displayed correctly on the VGA monitor.

                        7.2 Timings for front porch, sync pulse, and back porch

                                       Horizontal sync    Vertical sync
                        Front porch    0.94 µs            0.35 ms
                        Sync pulse     3.77 µs            0.063 ms
                        Back porch     1.89 µs            1.02 ms
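                        These intervals can be combined with the 40 ns pixel time to estimate the
                        line period and refresh rate. The sketch reads the table values as 0.94,
                        3.77 and 1.89 µs (horizontal) and 0.35, 0.063 and 1.02 ms (vertical), and
                        assumes 640 visible pixels per line and 480 visible lines; the function
                        names are chosen only for this example.

```python
PIXEL_NS = 40.0                                        # one pixel per 25 MHz clock
H_FRONT_US, H_SYNC_US, H_BACK_US = 0.94, 3.77, 1.89    # horizontal intervals
V_FRONT_MS, V_SYNC_MS, V_BACK_MS = 0.35, 0.063, 1.02   # vertical intervals


def line_period_us(visible_pixels=640):
    """Visible time plus front porch, sync pulse and back porch for one line."""
    visible_us = visible_pixels * PIXEL_NS / 1000.0
    return visible_us + H_FRONT_US + H_SYNC_US + H_BACK_US


def frame_period_ms(visible_lines=480):
    """Visible lines plus the vertical front porch, sync pulse and back porch."""
    visible_ms = visible_lines * line_period_us() / 1000.0
    return visible_ms + V_FRONT_MS + V_SYNC_MS + V_BACK_MS


def refresh_hz():
    """Frames per second implied by the frame period."""
    return 1000.0 / frame_period_ms()
```

                        With these numbers a line takes about 32.2 µs and a frame about 16.9 ms,
                        giving a refresh rate of roughly 59 Hz, close to the 60 Hz target.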

                        8 FUNCTIONAL DESCRIPTION

                        8.1 Symbol

                        VGA Controller Symbol

                        Pin description

                        The pin description is shown in the table below.

                        Table: VGA Pin description

                        Name | Type | Polarity | Bus size | Description

                        Global Control Signals

                        CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

                        RST | I | High | - | Global system reset.

                        VGA Input Settings Signals

                        RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA); 0 = 800x600 (SVGA).

                        CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White; 01 = 16 Colors; 10 = 64 Colors.

                        DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

                        DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

                        Data Memory Control Signals

                        RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

                        DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                        ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive - once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                        VGA Monitor Control Signals

                        HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                        VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                        R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                        G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                        B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                        9 VGA SIGNAL TIMING

                        Figure 3 summarizes the signal timing involved in sending a line of
                        pixels and a frame of lines. The actual time values differ according to the
                        resolution selected (640x480 or 800x600), the processor-defined values for
                        DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
                        (CLK).

                        9.1 VGA PIXEL UNIT

                        The Pixel Unit provides access to the pixelated image stored in external
                        picture memory, reading in data a byte at a time and formatting each pixel to
                        be displayed. For each pixel, the 6-bit RGB value required for the control of
                        the monitor's three primary color electron guns is generated, so that the
                        pixel is displayed on the screen with the correct color.

                        Accessing the stored image

                        The image to be displayed on the monitor screen is written by the host
                        microcontroller into external memory space (RAM). This memory space can
                        be located anywhere (e.g., using a block of RAM within the FPGA design, or
                        using the SRAM on the NanoBoard itself).

                        Pixel data is stored in the memory space in bytes. The number of pixels in a
                        byte depends on which color palette is being used for the image (selected by
                        the CMOD input):

                        • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

                        • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

                        • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
                        (DATA[7..6] are not used)

                        The size of memory required to store a picture is determined by the total
                        number of viewable pixels in a line (determined by DISPSIZE_H), the total
                        number of viewable lines in a frame (determined by DISPSIZE_V), and the
                        number of pixels stored in each byte in memory space:

                        Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
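                        As a worked example of this formula, the sketch below computes the picture
                        memory for each palette at both resolutions. The mode keys "bw", "16" and
                        "64" are labels invented for this example.

```python
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to hold one frame, rounding up to a whole byte."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


vga_sizes = {m: picture_memory_bytes(640, 480, m) for m in PIXELS_PER_BYTE}
svga_sizes = {m: picture_memory_bytes(800, 600, m) for m in PIXELS_PER_BYTE}
```

                        A 640x480 picture therefore needs 38400 bytes in Black & White mode, 153600
                        bytes in 16 Colors mode, and 307200 bytes in 64 Colors mode.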

                        The address in RAM where the next pixel is stored is determined using
                        an internal signal provided by the Synchronization Unit - hvcnt - which
                        reflects the current contents of the MEMADDR register. The exact addressing
                        is described below.

                        9.2 Black & White mode

                        The picture memory address - the byte of data containing the next 8 pixels
                        - is determined by using bits 18..3 of hvcnt and right shifting the contents
                        by three:

                        ADDR_PIXEL = 00 & hvcnt[18..3]

                        16 Colors mode

                        The picture memory address - the byte of data containing the next 2 pixels
                        - is determined by using bits 18..1 of hvcnt and right shifting the contents
                        by one:

                        ADDR_PIXEL = 0 & hvcnt[18..1]

                        64 Colors mode

                        The picture memory address - the byte of data containing the next pixel - is
                        determined by using the full value of hvcnt:

                        ADDR_PIXEL = hvcnt[18..0]
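                        In software terms, the three cases amount to shifting the pixel counter right
                        by the number of bits that select a pixel within a byte. The sketch below
                        models this with plain integers; the mode keys and function name are
                        assumptions for illustration, and the zero-padding of the hardware bus is
                        implicit in the integer shift.

```python
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}  # log2 of pixels per byte
HVCNT_MASK = (1 << 19) - 1                   # hvcnt is a 19-bit value


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory holding pixel number hvcnt."""
    return (hvcnt & HVCNT_MASK) >> ADDRESS_SHIFT[mode]


# Pixel 17 lives in byte 2 (Black & White), byte 8 (16 Colors) or byte 17 (64 Colors).
examples = [addr_pixel(17, mode) for mode in ("bw", "16", "64")]
```

                        Consecutive pixel numbers map to the same byte address until all pixels in
                        that byte have been consumed.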

                        The Pixel register (PIXREG)

                        The Pixel register is used to receive the byte of pixel data read from the
                        current address in memory space. The register, PIXREG, is reset to zero when
                        the VGA Controller receives an external reset signal (RST).

                        The register is updated on each rising edge of the CLK signal. Data can be
                        read from the memory space as long as the RD signal is active (High). The RD
                        signal is itself controlled by the external line display enable signal, enex.
                        This internally generated signal is defined as follows:

                        If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT >
                        (LinTotal+1)) and (VCOUNT < Frame Period)) then
                            enex = 0
                        Else
                            enex = 1

                        When enex is Low, read access from memory is disabled (RD = 0).
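                        The enable condition translates directly into a small function. In this
                        sketch the parameter names line_period and frame_period stand for the Line
                        Period and Frame Period counts, and the values 800 and 525 used in the
                        example are assumed totals, not figures taken from this document.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while pixel data may be read, 0 during the blanking regions."""
    in_h_blank = pix_total + 1 < hcount < line_period
    in_v_blank = lin_total + 1 < vcount < frame_period
    return 0 if (in_h_blank or in_v_blank) else 1


def rd(hcount, vcount):
    """RD follows enex; example wrapper using assumed 640x480 totals."""
    return enex(hcount, vcount, 640, 480, 800, 525)
```

                        Inside the visible area rd() returns 1, and it drops to 0 once either counter
                        moves past the visible extent into blanking.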

                        The point at which data is loaded from memory into PIXREG depends on the
                        particular color palette that is chosen - Black & White, 16 Colors, or 64
                        Colors.

                        Black & White mode

                        The next byte of data will be loaded into the register whenever the
                        lowest three bits of the hvcnt signal - received from the Synchronization
                        Unit - are "000".

                        For the currently loaded byte, the active pixel is always in the lowest bit
                        position of the Pixel register. Each pixel in the data byte is moved into this
                        active pixel position by shifting the contents of the register right by one
                        bit on each rising edge of CLK.

                        16 Colors mode

                        The next byte of data will be loaded into the register whenever the
                        lowest bit of the hvcnt signal - received from the Synchronization Unit - is
                        a 0.

                        For the currently loaded byte, the active pixel is always in the low-order
                        nibble of the Pixel register. Remember that in this mode, each byte of data
                        contains two pixels. The second pixel is moved into this active pixel position
                        by shifting the contents of the register right by four bits on the rising edge
                        of CLK.

                        64 Colors mode

                        The next byte of data will be loaded into the register on the rising edge
                        of the external system clock signal (CLK). In this mode, the read of pixel
                        data does not depend on the status of the hvcnt signal received from the
                        Synchronization Unit.
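                        The shifting behaviour of PIXREG can be mimicked in software to show the
                        order in which pixels leave a byte. This is an illustrative sketch only; the
                        function name and mode keys are assumptions, and the loop stands in for
                        successive rising edges of CLK.

```python
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}  # (bits per pixel, pixels per byte)


def unpack_pixels(byte, mode):
    """Return the pixels of one data byte in the order they are displayed."""
    bits, count = PIXEL_FORMAT[mode]
    reg = byte & 0xFF                 # value loaded into the pixel register
    mask = (1 << bits) - 1
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)     # active pixel sits in the lowest bits
        reg >>= bits                  # shift the next pixel into position
    return pixels


bw_pixels = unpack_pixels(0xB2, "bw")
nibbles = unpack_pixels(0xB2, "16")
single = unpack_pixels(0xB2, "64")
```

                        In 64 Colors mode the mask discards bits 7 and 6, matching the note that
                        those bits are not used.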

                        9.3 The RGB register (RGBREG)

                        The RGB register is used to store the six bits that are required for
                        driving the red, green, and blue color guns of the target monitor. When the
                        chosen color palette is either Black & White or 16 Colors, these six bits are
                        obtained by mapping the value of the active pixel to a predefined RGB code.
                        When the chosen color palette is 64 Colors, the actual pixel value is used
                        directly.

                        The register, RGBREG, is reset to zero (000000) when the VGA
                        Controller receives an external reset signal (RST). This RGB code represents
                        black.

                        The register is updated on each rising edge of the CLK signal, and the
                        value loaded is dependent on the state of the line display enable signal, en.
                        When en is Low, blanking is required and RGBREG is loaded with the code
                        for black (000000).

                        Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
                        and a 4-bit pixel (16 Colors mode) into the required RGB color code.

                        The RGB color code stored in the RGB register is output from the VGA
                        Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
                        and B1).

                        The monitor itself expects analog signals as inputs to its electron gun
                        control circuits. This is achieved by using 2-bit digital-to-analog converter
                        circuitry, located on the NanoBoard itself, as shown in Figure 4.

                        Figure 9.1: Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each of the three analog inputs being one of four possible levels, the monitor can display each pixel on the screen in one of 4 x 4 x 4 = 64 different color combinations.
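The register behaviour described above can be sketched in VHDL as follows. This is a minimal illustration under stated assumptions, not the Controller's actual source: the entity name rgb_reg, the 6-bit input color_in (standing for the already mapped RGB code), the active-high asynchronous reset and the bit order R1 R0 G1 G0 B1 B0 are all hypothetical. Only the reset-to-black, blank-when-en-is-Low and update-on-rising-CLK behaviour is taken from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgb_reg is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;                     -- external reset (RST), assumed active high
    en       : in  std_logic;                     -- line display enable
    color_in : in  std_logic_vector(5 downto 0);  -- mapped RGB code, assumed order R1 R0 G1 G0 B1 B0
    r, g, b  : out std_logic_vector(1 downto 0)   -- 2-bit values sent to the external DACs
  );
end entity rgb_reg;

architecture rtl of rgb_reg is
  signal rgbreg : std_logic_vector(5 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= (others => '0');      -- 000000 = black
    elsif rising_edge(clk) then
      if en = '0' then
        rgbreg <= (others => '0');    -- blanking: force black
      else
        rgbreg <= color_in;           -- visible region: load the pixel colour
      end if;
    end if;
  end process;

  -- Split the 6-bit code into three 2-bit channels (4 levels each, 64 colours)
  r <= rgbreg(5 downto 4);
  g <= rgbreg(3 downto 2);
  b <= rgbreg(1 downto 0);
end architecture rtl;
```

Forcing the register to black while en is Low keeps the analog outputs at 0 V during blanking.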

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3 Generation FPGAs provide plentiful, efficient SelectRAM memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical; both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
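As an illustration of the synchronous, single-port use described above, the following sketch models a 16K x 1 memory (one block's worth of data bits) in behavioural VHDL. The entity name spram_16kx1 and its port names are hypothetical. Synthesis tools commonly map this coding pattern onto block RAM, but whether a given tool does so depends on the tool and its settings, so treat this as an assumption rather than a guaranteed result.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity spram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic_vector(0 downto 0);
    dout : out std_logic_vector(0 downto 0)
  );
end entity spram_16kx1;

architecture rtl of spram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic_vector(0 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout changes only on a clock edge with en = '1'.
        -- On a write, the old contents of the location are read out.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write are inside the clocked process, matching the statement that read operations also require a clock edge and clock enable.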


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                        Interoperability

                        Technology independence

                        Design reuse

                        Several levels of abstraction

                        Readability

                        Standard language

                        Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                        Design specification language

                        Design entry language

                        Design simulation language

                        Design documentation language

                        An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers

Funded by the US Department of Defense

The first publicly available version was released in 1985

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL

In 1987 standardization => IEEE 1076-1987

An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard

The IEEE 1164 standard was developed by the IEEE

IEEE 1164 contains definitions for a nine-valued data type, std_logic

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware

Defines, e.g., two numeric types: signed and unsigned

                        VHDL Environment

                        Design Units


                        Segments of VHDL code that can be compiled separately and stored in a

                        library

                        Entities

                        A black box with interface definition

Defines the inputs/outputs of a component (define pins)

                        A way to represent modularity in VHDL

                        Similar to symbol in schematic

                        Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture

One entity can have several architectures

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
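For reference, the Comparator entity and its Comparator1 architecture can be put into one compilable file together with a small self-checking testbench, as sketched below. The testbench (Comparator_tb, the signal names and the stimulus values) is an illustrative addition and not part of the original design. It also shows an entity with no port declaration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Testbench: an entity without ports
entity Comparator_tb is
end Comparator_tb;

architecture sim of Comparator_tb is
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  -- Bind the entity to a specific architecture explicitly
  dut : entity work.Comparator(Comparator1)
    port map (A => a_s, B => b_s, EQ => eq_s);

  stimulus : process
  begin
    a_s <= x"3C";
    b_s <= x"3C";
    wait for 10 ns;
    assert eq_s = '1'
      report "equal inputs should give EQ = '1'" severity error;

    b_s <= x"3D";
    wait for 10 ns;
    assert eq_s = '0'
      report "different inputs should give EQ = '0'" severity error;

    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end sim;
```

The direct instantiation entity work.Comparator(Comparator1) requires VHDL-93 or later. With VHDL-87 a component declaration and a configuration would be needed instead.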

Configurations

Link the entity declaration and the architecture body together

The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., a gate-level architecture and a behavioral architecture

Are always optional

Packages

Packages contain information common to many design units

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of a declaration part and an optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
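A small package illustrating the declaration part and the body part might look like the sketch below. Every name in it (vga_pkg, H_VISIBLE, V_VISIBLE, pixel_t, BLACK, invert) is hypothetical and chosen only for illustration; none of them appears in the controller source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see
package vga_pkg is
  constant H_VISIBLE : natural := 640;
  constant V_VISIBLE : natural := 480;

  subtype pixel_t is std_logic_vector(0 downto 0);

  constant BLACK : pixel_t;   -- deferred constant: value is given in the body

  function invert (p : pixel_t) return pixel_t;
end package vga_pkg;

-- Package body: subprogram bodies and deferred constant values
package body vga_pkg is
  constant BLACK : pixel_t := "0";

  function invert (p : pixel_t) return pixel_t is
  begin
    return not p;
  end function invert;
end package body vga_pkg;
```

A design unit compiled into the same library would make these names visible with "use work.vga_pkg.all;". A package that declares only fully specified constants and types needs no body at all.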

Libraries

Collection of VHDL design units (database)

1 Packages
- package declaration
- package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in a UNIX file system

Can also be any other kind of database

                        Levels of Abstraction

                        VHDL supports many possible styles of design description which differ

                        primarily in how closely they relate to the HW

                        It is possible to describe a circuit in a number of ways

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

                        Structural VHDL description

                        Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

                        For large circuits a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in the system

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
- Either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no 3rd-party models for registers => you must write the behavioral description of registers

The behavioral description can be provided in the form of subprograms (functions or procedures)

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, the order of sequential operations is defined

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis

                        Concurrent Vs Sequential

                        Processes

                        Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

The process statement is a concurrent statement

Statements inside process statements are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
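The two ways of suspending a process can be compared in a short simulation-only sketch. The entity process_demo and all signal and label names are hypothetical. The same flip-flop is written once with a sensitivity list and once with a wait statement, and a checking process asserts that both behave identically.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_demo is
end entity process_demo;

architecture sim of process_demo is
  signal clk : std_logic := '0';
  signal d   : std_logic := '0';
  signal q1  : std_logic := '0';
  signal q2  : std_logic := '0';
begin
  -- Clock generator: uses wait statements, so it has no sensitivity list.
  -- Four 10 ns periods, rising edges at 5, 15, 25 and 35 ns.
  clock_gen : process
  begin
    for i in 1 to 4 loop
      clk <= '0';
      wait for 5 ns;
      clk <= '1';
      wait for 5 ns;
    end loop;
    wait;  -- suspend forever
  end process clock_gen;

  -- Flip-flop with a sensitivity list: woken by any event on clk
  ff_sens : process (clk)
  begin
    if clk'event and clk = '1' then
      q1 <= d;
    end if;
  end process ff_sens;

  -- The same flip-flop with a wait statement instead of a sensitivity list
  ff_wait : process
  begin
    wait until clk'event and clk = '1';
    q2 <= d;
  end process ff_wait;

  -- Self-check: after the first rising edge both outputs must hold d
  check : process
  begin
    d <= '1';
    wait for 12 ns;
    assert q1 = '1' and q2 = '1'
      report "both flip-flops should have captured d" severity error;
    wait;
  end process check;
end architecture sim;
```

All four processes run concurrently with respect to each other, while the statements inside each one execute in order.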


                        11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- numeric_std and STD_LOGIC_ARITH both declare signed/unsigned;
-- neither type is referenced in this unit
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode matches the sync_unit entity
      c_hs       : out integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the clk50m input by two to obtain the pixel clock clk
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- sinit is driven directly by rstn, as in the original listing;
  -- check its polarity against the generated memory core
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                        SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks within one line (0 to 793)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vertical line counter: advances on each rising edge of h_en_s (0 to 515)
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while c_hs_s is 0 to 95
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync (front porch, back porch): low while line_count_s is below 2
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used by pixelgen to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;


                        PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: start of a new frame, restart the memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image: read the next 1-bit pixel and drive all three channels
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last stored pixel
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: output black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


                        12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array, the XC2064, in 1985.
The XC2064 had programmable gates and programmable interconnects between
gates, the beginnings of a new technology and market. The XC2064 boasted a
mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables
(LUTs). More than 20 years later, Freeman was entered into the National
Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs can be reprogrammed at run
time, and this is leading to the idea of reconfigurable computing or
reconfigurable systems: CPUs that reconfigure themselves to suit the task at
hand. The Mitrion Virtual Processor from Mitrionics is an example of a
reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at runtime, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates
  1987: 9,000 gates, Xilinx
  1992: 600,000, Naval Surface Warfare Department
  Early 2000s: millions

Market size
  1985: First commercial FPGA technology invented by Xilinx
  1987: $14 million
  ~1993: >$385 million
  2005: $1.9 billion
  2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing the final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:
• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

                        Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
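As a rough illustration of the kind of estimate mentioned above, Rent's rule
relates the number of external terminals T of a block of logic to its gate
count G as T = t * G^p. The sketch below is only an assumption-laden example:
the function name and the values chosen for t (terminals per gate) and p (the
Rent exponent) are illustrative and are not taken from this report or from any
vendor data.

```python
def rent_terminals(gates: int, t: float, p: float) -> float:
    """Estimate external terminals T = t * G**p for a block of G gates."""
    if gates < 1:
        raise ValueError("gates must be at least 1")
    return t * gates ** p


# Illustrative parameter values only (assumed, not measured on a real design).
for gates in (64, 1024, 16384):
    print(gates, round(rent_terminals(gates, t=2.0, p=0.5), 1))
```

A larger exponent p means the terminal count grows faster with the amount of
logic, which is why routing demand can differ widely between designs of equal
size.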

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

                        121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                        Typical logic block


There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
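The logic block described above can be modelled in a few lines of software. The
following is a minimal behavioural sketch, not a description of any particular
vendor's cell: the class name, method names and the choice of a 4-input XOR as
the configured function are all assumptions made for illustration.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)  # the LUT configuration bits
        self.registered = registered          # selects which output is used
        self.q = 0                            # flip-flop state

    def lut(self, a, b, c, d):
        """Combinational lookup: the four inputs form the table index."""
        return self.truth_table[a | (b << 1) | (c << 2) | (d << 3)]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered or unregistered LUT value."""
        return self.q if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input XOR (parity of the index bits).
xor4 = [bin(i).count("1") & 1 for i in range(16)]
block = LogicBlock(xor4, registered=True)
block.clock(1, 1, 1, 0)
print(block.output(0, 0, 0, 0))  # registered output holds the captured value
```

Changing the 16 table entries changes the logic function without changing the
structure, which is the sense in which the block is "programmable".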

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.
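The planar rule described above is simple enough to express directly. The
sketch below is an illustrative model only: the side names, the zero-based
track numbering and the function name are assumptions, not part of the
architecture description.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(side, track, channel_width):
    """Return the wires reachable from (side, track) in a planar switch box.

    A wire entering on one side can connect, through three programmable
    switches, to the wire with the same track number on each other side.
    """
    if side not in SIDES:
        raise ValueError("unknown side: " + side)
    if not 0 <= track < channel_width:
        raise ValueError("track outside the channel")
    return [(other, track) for other in SIDES if other != side]


print(planar_switch_box("west", 1, channel_width=4))
```

Because the track number never changes inside the box, a signal that starts on
a given track stays in that "domain" for its whole route, which keeps the
switch count low at the price of some routing flexibility.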

                        Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM: based on static memory technology. In-system programmable and
  re-programmable. Requires external boot devices. CMOS.
• Antifuse: one-time programmable. CMOS.
• EPROM: Erasable Programmable Read-Only Memory technology. Usually
  one-time programmable in production because of plastic packaging.
  Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM: Electrically Erasable Programmable Read-Only Memory
  technology. Can be erased, even in plastic packages. Some, but not all,
  EEPROM devices can be in-system programmed. CMOS.
• Flash: Flash-erase EPROM technology. Can be erased, even in plastic
  packages. Some, but not all, flash devices can be in-system programmed.
  Usually, a flash cell is smaller than an equivalent EEPROM cell and is
  therefore less expensive to manufacture. CMOS.
• Fuse: one-time programmable. Bipolar.

                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for
hardware verification and early software development.

Verification of hardware designs together with early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC designs on FPGAs
has become a good method to do this.

Reasons why prototyping is important:

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
   it is functionally correct, compared with designers relying only on software
   simulations to verify that their hardware design is sound. Simulation speed
   and modeling accuracy limitations hinder this development.
2. Due to time constraints, many projects cannot wait until the silicon is back
   from the foundry to start on software tests. FPGA prototyping allows much
   more time for software development and testing at the software-hardware
   integration stage. This allows many unforeseen software bugs, which appear
   due to today's array of operating systems, applications, and hardware, to be
   found early.
3. Prototyping also allows the developer to ensure that all IP technologies on
   the system work well together off the simulation stage and in actual form.
4. Prototyping has the added advantage of serving as a demo platform for SoC
   clients, bringing in interest early. This speeds up the overall development
   cycle and allows for more enhancement or improvement of the chip features
   than would otherwise have been possible.

                        13 SIMULATION RESULTS

GATE LEVEL

Fig VGA CONTROLLER

GATE LEVEL

Fig SYNC UNIT

TECHNOLOGY SCHEMATIC

                        14 APPLICATIONS

                        Motion Tracking Systems

                        Image processing systems

                        Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array
2. "A VGA display controller" by Eduardo Sanchez
3. Hinner, "VGA Interface and video signal documents"


                          0 to (PixTotal x LinTotal) - 1

Taking the maximum number of pixels in a line and lines in a frame for each
of the supported display resolutions, the maximum ranges for the counter are:

• 640x480: 0 to 307199
• 800x600: 0 to 479999

The counter has 19-bit resolution.
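The counter ranges and the 19-bit width quoted above can be checked with a
short calculation. This is only a sketch; the function name is an assumption
introduced for illustration.

```python
def counter_range(pix_total, lin_total):
    """Return (maximum counter value, bits needed) for a pixel address counter."""
    max_value = pix_total * lin_total - 1
    return max_value, max_value.bit_length()


for width, height in ((640, 480), (800, 600)):
    max_value, bits = counter_range(width, height)
    print(f"{width}x{height}: 0 to {max_value}, {bits} bits")
```

Both resolutions need more than 18 bits (2^18 = 262144) and fit within 19 bits
(2^19 = 524288), so a single 19-bit counter covers either mode.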

While the value in the horizontal counter (HCOUNT) is less than the
total number of viewable pixels in a line (PixTotal, the integer value of
DISPSIZE_H), the counter is incremented on the rising edge of the external
clock signal (CLK). Pixel addressing within the frame is consecutive. When
the counter reaches the last pixel in a line, its incremented value is the
first pixel in the next line down.

The address counter will continue to be incremented until the value in the
vertical counter (VCOUNT) is greater than or equal to the total number of
viewable lines in a frame (LinTotal, the integer value of DISPSIZE_V). At this
point, it will be rolled over to zero.
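The counting behaviour described above can be modelled as a software
simulation. The sketch below is an illustration under stated assumptions, not
the controller's actual logic: the function names and the tiny 4x3 frame with
blanking (6 clocks per line, 4 lines per frame) are invented for the example.

```python
def next_address(address, hcount, vcount, pix_total, lin_total):
    """Address counter value after one rising clock edge."""
    if vcount >= lin_total:      # past the last viewable line: roll over
        return 0
    if hcount < pix_total:       # inside the viewable part of the line
        return address + 1
    return address               # horizontal blanking: hold the value


def scan_frame(pix_total, lin_total, h_total, v_total):
    """Scan one frame; return the addresses seen at viewable pixels and the
    final counter value."""
    address = 0
    visible = []
    for vcount in range(v_total):
        for hcount in range(h_total):
            if vcount < lin_total and hcount < pix_total:
                visible.append(address)
            address = next_address(address, hcount, vcount, pix_total, lin_total)
    return visible, address


visible, final = scan_frame(pix_total=4, lin_total=3, h_total=6, v_total=4)
print(visible, final)
```

The addresses at viewable pixels run consecutively across line boundaries, and
the counter is back at zero by the time the next frame starts.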


7 VGA Controller

The VGA Controller provides a simple 8-bit interface between a host
microcontroller and any VGA-compatible monitor. This variant of the
Controller provides six modes of display, depending on the resolution chosen
(640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &
White, 16 Colors, or 64 Colors).

7.1 VGA Controller Tasks

1) To display the frames at a minimum 60 Hz refresh rate:

   1 sec = 60 frames
   1 frame = max 1/60 sec
   1 pixel = (1/60 sec) / 307200 ≈ 54 ns
   Max time for 1 pixel = 40 ns (allowing for the blanking intervals around
   the visible area)

2) To send information for each and every pixel of a frame by means of the
   RGB signals.

3) After one row has been completely displayed, the column must be reset
   and the row incremented; this is done by the Hsync signal.

4) After one frame has been completed, a new frame has to be started; for
   this, both row and column are reset, which is done by the Vsync signal.

=> One clock is used to give the information for 1 pixel (and to control all
operations):

   1 clock = 1 pixel
   1 clock cycle period = 40 ns
   Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

So if we use a 25 MHz clock and display 1 pixel of information on each and
every clock, we can attain a minimum 60 Hz refresh rate. Do not use a clock
frequency lower than 25 MHz.
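The pixel-time arithmetic above can be reproduced as follows. This is a sketch
under an explicit assumption: the 800 x 525 total frame size (visible area
plus blanking) is the commonly quoted 640x480 at 60 Hz timing and does not
appear in this report; the function names are likewise illustrative.

```python
def pixel_period_ns(pixels_per_line, lines_per_frame, refresh_hz):
    """Time available per pixel, in nanoseconds, at a given refresh rate."""
    return 1e9 / (pixels_per_line * lines_per_frame * refresh_hz)


def clock_mhz(period_ns):
    """Clock frequency in MHz for a given clock period in nanoseconds."""
    return 1e3 / period_ns


# Visible pixels only: the upper bound on the time per pixel.
print(round(pixel_period_ns(640, 480, 60), 2))   # about 54 ns
# Assumed total frame including blanking (800 x 525): close to 40 ns.
print(round(pixel_period_ns(800, 525, 60), 2))
# A 40 ns period corresponds to the 25 MHz clock used by the controller.
print(clock_mhz(40.0))
```

Counting only the visible pixels gives about 54 ns per pixel; once blanking is
included the budget drops to roughly 40 ns, which is why a 25 MHz clock is the
practical minimum for a 60 Hz refresh.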

Ex:

• 15 MHz: the time period is more, so the number of frames/sec is less
  (it is unable to display 60 frames/sec).
• 25 MHz: effectively utilized frequency.
• 30 MHz: the time period is less, so the number of frames/sec is more.

Signals used for image display:

• R (Red)
• G (Green)
• B (Blue)
• HSYNC (horizontal synchronization signal)
• VSYNC (vertical synchronization signal)

All the above signals are used to control the image data so that it is
correctly displayed on the VGA monitor.

7.2 Timings for front porch, sync pulse and back porch

                Horizontal sync    Vertical sync
Front porch     0.94 µs            0.35 ms
Sync pulse      3.77 µs            0.063 ms
Back porch      1.89 µs            1.02 ms
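As a rough cross-check of these timings, the sketch below adds the porch and sync durations to the visible part of a line and of a frame. It assumes a 25 MHz pixel clock (40 ns per pixel) and a 640 x 480 visible area; the dictionary and function names are hypothetical.

```python
# Sketch only: assumes a 25 MHz pixel clock and a 640 x 480 visible area.
PIXEL_CLOCK_MHZ = 25.0
H_VISIBLE_PIXELS = 640
V_VISIBLE_LINES = 480

# Durations copied from the timing table above.
H_US = {"front": 0.94, "sync": 3.77, "back": 1.89}    # microseconds
V_MS = {"front": 0.35, "sync": 0.063, "back": 1.02}   # milliseconds


def line_period_us():
    """Visible time of one line plus horizontal front porch, sync and back porch."""
    visible_us = H_VISIBLE_PIXELS / PIXEL_CLOCK_MHZ    # 640 * 40 ns = 25.6 us
    return visible_us + sum(H_US.values())


def frame_period_ms():
    """Visible lines plus vertical front porch, sync and back porch."""
    visible_ms = V_VISIBLE_LINES * line_period_us() / 1000.0
    return visible_ms + sum(V_MS.values())


def clocks(duration_us):
    """Number of pixel-clock cycles in a duration given in microseconds."""
    return duration_us * PIXEL_CLOCK_MHZ


if __name__ == "__main__":
    print(f"line period  = {line_period_us():.2f} us")
    print(f"frame period = {frame_period_ms():.3f} ms")
    print(f"refresh rate = {1000.0 / frame_period_ms():.1f} Hz")
    print(f"hsync pulse  = {clocks(H_US['sync']):.2f} clocks")
```

With these figures the line period comes to about 32.2 µs and the frame period to about 16.9 ms, i.e. a refresh rate close to 60 Hz.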

8 FUNCTIONAL DESCRIPTION

8.1 Symbol

                          VGA Controller Symbol

                          Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name          Type  Polarity   Bus size  Description

Global Control Signals

CLK           I     Rising     -         Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

RST           I     High       -         Global system reset.

VGA Input Settings Signals

RESOLUTION    I     High/Low   -         This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD          I     -          2         This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H    I     -          10        This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V    I     -          10        This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD            O     High       -         This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA          I     -          8         Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL    O     -          19        Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC         O     Falling    -         Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC         O     Falling    -         Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1        O     High/Low   -         Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1        O     High/Low   -         Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1        O     High/Low   -         Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                          9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

                          Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode, one byte of data in RAM contains 8 1-bit pixels

• in 16 Colors mode, one byte of data in RAM contains 2 4-bit pixels

• in 64 Colors mode, one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
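A minimal sketch of this memory calculation follows. The mode keys ("bw", "16", "64") and the function name are hypothetical labels chosen for the example; the pixels-per-byte figures are the ones given for the three palettes.

```python
# Sketch only: mode keys and function name are illustrative labels.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_bytes(pix_total, lin_total, mode):
    """Bytes of picture memory for pix_total x lin_total viewable pixels."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so that a partly filled final byte is still counted.
    return -(-pixels // per_byte)


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, picture_bytes(640, 480, mode), "bytes")
```

For a full 640 x 480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.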

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right-shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right-shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
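The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. The sketch below models that behaviour in software; the mode keys and function name are hypothetical, and hvcnt is treated as a 19-bit unsigned count, as implied by the 19-bit ADDR_PIXEL bus.

```python
# Sketch only: a software model of the byte address derived from hvcnt.
SHIFT_FOR_MODE = {"bw": 3, "16": 1, "64": 0}   # right shift per color mode
HVCNT_MASK = (1 << 19) - 1                     # hvcnt treated as 19 bits wide


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory that holds pixel number hvcnt."""
    return (hvcnt & HVCNT_MASK) >> SHIFT_FOR_MODE[mode]


if __name__ == "__main__":
    for count in (0, 7, 8, 9):
        print(count, [addr_pixel(count, m) for m in ("bw", "16", "64")])
```

In Black & White mode, pixel counts 0 to 7 all map to byte 0 and count 8 moves on to byte 1; in 64 Colors mode the address simply equals the pixel count.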

                          The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
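The enex condition can be expressed as a small predicate. This is a behavioural sketch, not the controller's implementation; the function and argument names are hypothetical, and the line period of 800 and frame period of 525 used in the demonstration are assumed example values.

```python
# Sketch only: behavioural model of the enex definition quoted above.
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while picture memory may be read, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    # Assumed example periods: 800 clocks per line, 525 lines per frame.
    for h, v in ((0, 0), (641, 0), (700, 0), (0, 500)):
        print((h, v), enex(h, v, 640, 480, 800, 525))
```

Note that the comparison is strict, so a count equal to PixTotal + 1 (or LinTotal + 1) still leaves enex High.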

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
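The load-and-shift behaviour of the Pixel register can be modelled in software as follows. This is a sketch under the description above (active pixel in the low-order bits, register shifted right once per pixel); the mode keys and function name are hypothetical.

```python
# Sketch only: software model of PIXREG handing out pixels from one byte.
PIXEL_FORMAT = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}   # (bits per pixel, pixels per byte)


def unpack_pixels(byte, mode):
    """Return the pixels of one data byte in the order they become active."""
    width, count = PIXEL_FORMAT[mode]
    mask = (1 << width) - 1
    reg = byte & 0xFF
    pixels = []
    for _ in range(count):
        pixels.append(reg & mask)   # the active pixel sits in the low-order bits
        reg >>= width               # shift the next pixel into the active position
    return pixels


if __name__ == "__main__":
    print(unpack_pixels(0b10110001, "bw"))
    print(unpack_pixels(0xA5, "16"))
    print(unpack_pixels(0xFF, "64"))
```

In 64 Colors mode the two unused upper bits of the byte are simply masked off, so 0xFF yields the single 6-bit pixel 0x3F.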

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4).
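To make the four levels concrete, the sketch below maps a 2-bit code to a voltage between 0 V and 0.7 V. The even spacing of the two intermediate levels is an assumption made for illustration; the actual values depend on the resistor network of the external DAC, which is not specified here. The function name is hypothetical.

```python
# Sketch only: evenly spaced levels are an assumption; the real DAC's
# intermediate levels depend on its resistor network.
FULL_SCALE_V = 0.7


def dac_level(code):
    """Analog level in volts for a 2-bit intensity code (0..3)."""
    return FULL_SCALE_V * (code & 0b11) / 3


def color_count(levels_per_channel=4, channels=3):
    """Number of distinct colors from independent R, G and B levels."""
    return levels_per_channel ** channels


if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    print(color_count())
```

Three channels with four levels each give the 64 color permutations mentioned above.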

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

                          Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
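As a rough sizing aid, the sketch below estimates how many block RAMs a picture would occupy and which Spartan-3E device, if any, has enough of them. It counts only the 16K data bits per block, ignores the parity bits and any aspect-ratio constraints, and uses the block totals from the device table; the function names are hypothetical.

```python
# Sketch only: counts the 16K data bits per block and ignores parity bits
# and aspect-ratio constraints.
DATA_BITS_PER_BLOCK = 16 * 1024

# Total RAM blocks per device, smallest device first.
TOTAL_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def blocks_needed(pix_total, lin_total, bits_per_pixel):
    """Block RAMs needed to hold the picture, rounded up."""
    bits = pix_total * lin_total * bits_per_pixel
    return -(-bits // DATA_BITS_PER_BLOCK)


def smallest_device(blocks):
    """First device in the table with at least `blocks` block RAMs, else None."""
    for device, total in TOTAL_BLOCKS.items():
        if total >= blocks:
            return device
    return None


if __name__ == "__main__":
    for bpp in (1, 4):
        n = blocks_needed(640, 480, bpp)
        print(f"{bpp} bit(s)/pixel: {n} blocks -> {smallest_device(n)}")
```

Under these assumptions a 640 x 480 picture at 1 bit per pixel needs 19 blocks and fits in an XC3S500E, while the same picture at 4 bits per pixel needs 75 blocks, more than any device in the table provides.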

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                          Interoperability

                          Technology independence

                          Design reuse

                          Several levels of abstraction

                          Readability

                          Standard language

                          Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                          Design specification language

                          Design entry language

                          Design simulation language

                          Design documentation language

                          An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the U.S. Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

                          VHDL Environment

                          Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

                          Entities

                          A black box with interface definition

Defines the inputs/outputs of a component (defines the pins)

                          A way to represent modularity in VHDL

                          Similar to symbol in schematic

                          Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Are always optional.

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries

A collection of VHDL design units (database):

1 Packages
- package declaration
- package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

                          Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

A process is quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or the wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
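The two alternatives in the general format can be compared side by side. In the sketch below (the entity name and_gate and its ports are hypothetical), both processes describe the same AND function: the first uses a sensitivity list, the second an explicit wait statement at the end of the process body.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical example: one function, two equivalent process styles.
entity and_gate is
    port (
        a, b   : in  std_logic;
        y1, y2 : out std_logic
    );
end and_gate;

architecture behv of and_gate is
begin
    -- Style 1: sensitivity list, no wait statement.
    p_sens : process (a, b)
    begin
        y1 <= a and b;
    end process p_sens;

    -- Style 2: no sensitivity list; the process suspends at the wait
    -- statement until a or b changes, then runs again from the top.
    p_wait : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process p_wait;
end behv;
```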


                          11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : in  std_logic_vector(17 downto 0);
        clk   : in  std_logic;
        dout  : out std_logic_vector(0 downto 0);
        en    : in  std_logic;
        sinit : in  std_logic);
end component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the 50 MHz board clock by two to obtain the pixel clock.
process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

                          SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
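The numeric thresholds in the synchronization block are easier to read and retune when they are given names. The package below is only a sketch: the package name vga_timing_pkg and the constant names are hypothetical, and the values are copied from the comparisons in the listing above rather than taken from a timing standard.

```vhdl
-- Hypothetical package collecting the counter thresholds used by sync_unit.
package vga_timing_pkg is
    -- Horizontal thresholds, in pixel-clock ticks
    constant H_COUNT_MAX    : integer := 793;  -- c_hs_s wraps to 0 after this value
    constant H_SYNC_LAST    : integer := 95;   -- hsync is low while c_hs_s <= this
    constant H_ACTIVE_FIRST : integer := 143;  -- first count with h_en high
    constant H_ACTIVE_LAST  : integer := 778;  -- last count with h_en high

    -- Vertical thresholds, in lines
    constant V_COUNT_MAX    : integer := 515;  -- line_count_s wraps to 0 after this value
    constant V_SYNC_LINES   : integer := 2;    -- vsync is low while line_count_s < this
    constant V_ACTIVE_FIRST : integer := 35;   -- first line with v_en high
    constant V_ACTIVE_LAST  : integer := 515;  -- last line with v_en high
end package vga_timing_pkg;
```

With use work.vga_timing_pkg.all; added above the entity, a comparison such as c_hs_s = 793 could then be written as c_hs_s = H_COUNT_MAX.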


                          PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        -- Restart the read address at the start of the visible frame.
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;

                          12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, their programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000 gates, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

– IC costs are rising aggressively

– ASIC complexity has increased development time and costs

– R&D resources and headcount are decreasing

– Revenue losses for slow time-to-market are increasing

– Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a 4-8 hour wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
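The logic block just described can be modeled behaviorally in a few lines. This is a sketch under stated assumptions: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are all hypothetical, and a real device fixes these choices through its configuration bits rather than through generics.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioral model of a classic logic block:
-- one 4-input LUT, one flip-flop, and a single output.
entity logic_block is
    generic (
        -- Truth table of the LUT; bit i is the output for input value i.
        -- x"8000" sets only bit 15, which makes the LUT a 4-input AND.
        LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
        -- Selects the registered (true) or unregistered (false) output.
        REGISTERED : boolean := false
    );
    port (
        clk    : in  std_logic;
        lut_in : in  std_logic_vector(3 downto 0);
        q      : out std_logic
    );
end logic_block;

architecture behv of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16-entry table indexed by the four inputs.
    lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

    -- The flip-flop captures the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Only one of the two values reaches the block output.
    q <= ff_out when REGISTERED else lut_out;
end behv;
```

Changing LUT_INIT changes the logic function without touching the structure, which is the sense in which the block is programmable.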

For this example architecture, the locations of the FPGA logic block pins are shown below.

Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

                          Generally, the FPGA routing is unsegmented. That is, each wiring segment

                          spans only one logic block before it terminates in a switch box. By turning on

                          some of the programmable switches within a switch box, longer paths can be

                          constructed. For higher speed interconnect, some FPGA architectures use

                          longer routing lines that span multiple logic blocks.

                          Whenever a vertical and a horizontal channel intersect, there is a switch box. In

                          this architecture, when a wire enters a switch box, there are three

                          programmable switches that allow it to connect to three other wires in adjacent

                          channel segments. The pattern, or topology, of switches used in this

                          architecture is the planar or domain-based switch box topology. In this switch

                          box topology, a wire in track number 1 connects only to wires in track

                          number 1 in adjacent channel segments, wires in track number 2 connect

                          only to other wires in track number 2, and so on. The figure below illustrates

                          the connections in a switch box.

                          Switch box topology
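The planar (domain-based) topology described above can be written down as a small connectivity table. This is a minimal sketch; the side names and the function name planar_switch_box are assumptions made for the example.

```python
SIDES = ("left", "top", "right", "bottom")  # the four channel segments meeting at the box


def planar_switch_box(channel_width):
    """Map every incoming wire to the wires it can be switched onto.

    A wire is identified as (side, track). In the planar topology a wire on
    track t can reach only track t on each of the three other sides.
    """
    connections = {}
    for side in SIDES:
        for track in range(channel_width):
            connections[(side, track)] = [
                (other, track) for other in SIDES if other != side
            ]
    return connections
```

With a channel width of 4, the wire ("left", 2) gets exactly three programmable switches, leading to track 2 on the top, right and bottom sides.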

                          Modern FPGA families expand upon the above capabilities to include higher

                          level functionality fixed into the silicon. Having these common functions

                          embedded into the silicon reduces the area required and gives those functions

                          increased speed compared to building them from primitives. Examples of these

                          include multipliers, generic DSP blocks, embedded processors, high-speed I/O

                          logic and embedded memories.

                          FPGAs are also widely used for systems validation, including pre-silicon

                          validation, post-silicon validation and firmware development. This allows chip

                          companies to validate their design before the chip is produced in the factory,

                          reducing the time to market.

                          12.4 FPGA Design and Programming

                          To define the behavior of the FPGA, the user provides a hardware description

                          language (HDL) or a schematic design. The HDL form might be easier to work

                          with when handling large structures because it's possible to just specify them

                          numerically rather than having to draw every piece by hand. On the other hand,

                          schematic entry can allow for easier visualisation of a design.

                          Then, using an electronic design automation tool, a technology-mapped netlist

                          is generated. The netlist can then be fitted to the actual FPGA architecture

                          using a process called place-and-route, usually performed by the FPGA

                          company's proprietary place-and-route software. The user will validate the

                          map, place and route results via timing analysis, simulation and other

                          verification methodologies. Once the design and validation process is

                          complete, the binary file generated (also using the FPGA company's

                          proprietary software) is used to (re)configure the FPGA.

                          Going from schematic/HDL source files to actual configuration: the source

                          files are fed to a software suite from the FPGA/CPLD vendor that, through

                          different steps, will produce a file. This file is then transferred to the

                          FPGA/CPLD via a serial interface (JTAG) or to an external memory device

                          like an EEPROM.

                          The most common HDLs are VHDL and Verilog, although in an attempt to

                          reduce the complexity of designing in HDLs, which have been compared to the

                          equivalent of assembly languages, there are moves to raise the abstraction level

                          through the introduction of alternative languages.

                          To simplify the design of complex systems in FPGAs, there exist libraries of

                          predefined complex functions and circuits that have been tested and optimized

                          to speed up the design process. These predefined circuits are commonly called

                          IP cores, and are available from FPGA vendors and third-party IP suppliers

                          (rarely free, and typically released under proprietary licenses). Other

                          predefined circuits are available from developer communities such as

                          OpenCores (typically free, and released under the GPL, BSD or similar license) and

                          other sources.

                          In a typical design flow, an FPGA application developer will simulate the

                          design at multiple stages throughout the design process. Initially, the RTL

                          description in VHDL or Verilog is simulated by creating test benches to

                          simulate the system and observe results. Then, after the synthesis engine has

                          mapped the design to a netlist, the netlist is translated to a gate level

                          description, where simulation is repeated to confirm the synthesis proceeded

                          without errors. Finally, the design is laid out in the FPGA, at which point

                          propagation delays can be added and the simulation run again with these

                          values back-annotated onto the netlist.

                          Basic Process Technology Types

                          SRAM - Based on static memory technology. In-system programmable and

                          re-programmable. Requires external boot devices. CMOS.

                          Antifuse - One-time programmable. CMOS.

                          EPROM - Erasable Programmable Read-Only Memory technology. Usually

                          one-time programmable in production because of plastic packaging.

                          Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                          EEPROM - Electrically Erasable Programmable Read-Only Memory

                          technology. Can be erased, even in plastic packages. Some, but not all,

                          EEPROM devices can be in-system programmed. CMOS.

                          Flash - Flash-erase EPROM technology. Can be erased, even in plastic

                          packages. Some, but not all, flash devices can be in-system programmed.

                          Usually, a flash cell is smaller than an equivalent EEPROM cell and is

                          therefore less expensive to manufacture. CMOS.

                          Fuse - One-time programmable. Bipolar.

                          Major Manufacturers

                          Xilinx and Altera are the current FPGA market leaders and long-time industry

                          rivals. Together, they control over 80 percent of the market, with Xilinx alone

                          representing over 50 percent.

                          Xilinx also provides free Windows and Linux design software, while Altera

                          provides free Windows tools; the Solaris and Linux tools are only available via

                          a rental scheme.

                          Other competitors include Lattice Semiconductor (flash, SRAM), Actel

                          (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),

                          Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld

                          focused CSSP, no general purpose FPGAs).

                          12.5 FPGA prototype

                          FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC

                          prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for

                          hardware verification and early software development.

                          Verification of the hardware design together with early software and

                          firmware co-design has become mainstream. Prototyping SoC and ASIC

                          designs on an FPGA has become a good method of doing this.

                          Reasons why Prototyping is important

                          1 Running a SoC design on an FPGA prototype is a reliable way to ensure that it is

                          functionally correct. This is compared to designers relying only on software

                          simulations to verify that their hardware design is sound. Simulation speed and

                          modeling accuracy limitations hinder development that relies on simulation alone.

                          2 Due to time constraints, many projects cannot wait until the silicon is back from

                          the foundry to start on software tests. FPGA prototyping allows for much more

                          time in the area of software development and testing at the software-hardware

                          integration stage. This allows many unforeseen software bugs that appear due

                          to today's array of operating systems, applications and hardware to be found early.

                          3 Prototyping also allows the developer to ensure that all IP technologies on the

                          system work well together, off the simulation stage and in actual form.

                          4 Prototyping has the added advantage of serving as a demo platform for SoC clients,

                          bringing in interest early. This speeds up the overall development cycle and

                          allows for more enhancement or improvement to the chip features than would

                          otherwise have been possible.

                          13 SIMULATION RESULTS

                          GATE LEVEL

                          Fig VGA CONTROLLER

                          GATE LEVEL

                          Fig SYNC UNIT

                          TECHNOLOGY SCHEMATIC

                          14 APPLICATIONS

                          Motion Tracking Systems

                          Image processing systems

                          Displaying Systems

                          14.1 CONCLUSION

                          Hence, with the VGA Controller implemented on an FPGA, the Controller takes a

                          processor-generated (pixelated) picture from memory space and provides digital

                          RGB values for each pixel, as well as horizontal and vertical synchronization

                          signals, in order to correctly display the picture on a connected monitor.



                            7 VGA Controller

                            The VGA Controller provides a simple 8-bit interface between a host

                            microcontroller and any VGA-compatible monitor. This variant of the

                            Controller provides six modes of display, depending on the resolution chosen

                            (640x480 (VGA) or 800x600 (SVGA)) and the color palette (either Black &

                            White, 16 Colors or 64 Colors).

                            7.1 VGA Controller Tasks

                            1) To display the frames at a minimum 60 Hz refresh rate

                            For 1 sec = 60 frames

                            1 frame = max 1/60 sec

                            1 pixel = (1/60 sec) / 307200 = approx. 54 ns (640 x 480 = 307200 visible pixels)

                            Max time for 1 pixel = 40 ns (allowing for the blanking intervals)

                            2) To send information for each and every pixel of a frame by means of

                            RGB signals

                            3) After completely displaying 1 row of information, the column must be

                            reset and the row has to be incremented, i.e. done by the Hsync signal

                            4) After completing 1 frame of information, a new frame has to be started;

                            for this, the row and column are to be reset, i.e. done by the Vsync signal

                            => For giving information for 1 pixel we use 1 clock (for controlling all

                            operations)

                            1 clock = 1 pixel

                            1 clock cycle period = 40 ns

                            Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                            So if we use a 25 MHz clock and display 1 pixel of information on each and every

                            clock, then we can attain the minimum 60 Hz refresh rate.

                            Do not use a clock frequency lower than 25 MHz.

                            Ex:

                            Clock frequency      15 MHz (decreases)     25 MHz                              30 MHz (increases)

                            Time period          More                   (effectively utilized frequency)    Less

                            No. of frames/sec    Less                                                       More

                            (At 15 MHz it is unable to display 60 frames/sec)
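The arithmetic behind the clock choice can be checked with a few lines of Python. This is a sketch under one stated assumption: the refresh rate is taken to scale linearly with the clock, anchored at the pairing given in this report (25 MHz for 60 Hz). The function names are illustrative only.

```python
REFERENCE_CLOCK_HZ = 25e6    # clock chosen in the text
REFERENCE_REFRESH_HZ = 60.0  # refresh rate it is said to achieve


def visible_pixel_time_ns(width=640, height=480, refresh_hz=60.0):
    """Time available per visible pixel if a frame lasts 1/refresh_hz seconds."""
    return 1e9 / (refresh_hz * width * height)


def clock_period_ns(clock_hz):
    """Period of one clock cycle, i.e. the time spent on one pixel."""
    return 1e9 / clock_hz


def refresh_rate_hz(clock_hz):
    """Refresh rate for a given clock, scaled from the 25 MHz / 60 Hz pairing."""
    return REFERENCE_REFRESH_HZ * clock_hz / REFERENCE_CLOCK_HZ


if __name__ == "__main__":
    print(f"visible-pixel budget: {visible_pixel_time_ns():.2f} ns")
    for mhz in (15, 25, 30):
        hz = mhz * 1e6
        print(f"{mhz} MHz: period {clock_period_ns(hz):.2f} ns, "
              f"refresh {refresh_rate_hz(hz):.1f} Hz")
```

Under this scaling a 15 MHz clock falls well short of 60 frames per second, while 30 MHz exceeds it, which is the comparison the example above is making.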

                            Signals used for image display:

                            R (Red)

                            G (Green)

                            B (Blue)

                            HSYNC (Horizontal Synchronization signal)

                            VSYNC (Vertical Synchronization signal)

                            All the above signals are used to control the image data so that it is correctly

                            displayed on the VGA display monitor.

                            7.2 Timings for front porch, sync pulse and back porch

                                           Horizontal sync    Vertical sync

                            Front porch    0.94 µs            0.35 ms

                            Sync pulse     3.77 µs            0.063 ms

                            Back porch     1.89 µs            1.02 ms

                            8 FUNCTIONAL DESCRIPTION

                            8.1 Symbol

                            VGA Controller Symbol

                            Pin description

                            The pin description is shown in the table below.

                            Table VGA Pin description

                            Name | Type | Polarity | Bus size | Description

                            Global Control Signals

                            CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:

                                640x480: CLK = 25MHz, Refresh = 60Hz; CLK = 30MHz, Refresh = 72Hz

                                800x600: CLK = 40MHz, Refresh = 60Hz; CLK = 50MHz, Refresh = 72Hz

                            RST | I | High | - | Global system reset.

                            VGA Input Settings Signals

                            RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

                            CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

                            DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

                            DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

                            Data Memory Control Signals

                            RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal en, generated by the Synchronization Unit of the Controller.

                            DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette as follows: Black & White mode: 8 1-bit pixels; 16 Colors mode: 2 4-bit pixels; 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                            ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive - once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                            VGA Monitor Control Signals

                            HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                            VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                            R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                            G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                            B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                            9 VGA SIGNAL TIMING

                            Figure 3 summarizes the signal timing involved in sending a line of

                            pixels and a frame of lines. The actual time values differ according to the

                            resolution selected (640x480 or 800x600), the processor-defined values for

                            DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal

                            (CLK).

                            9.1 VGA PIXEL UNIT

                            The Pixel Unit provides access to the pixelated image stored in external

                            picture memory, reading in data a byte at a time and formatting each pixel to

                            be displayed. For each pixel, the 6-bit RGB value required for the control of

                            the monitor's three primary color electron guns is generated, so that the pixel

                            is displayed on the screen with the correct color.

                            Accessing the stored image

                            The image to be displayed on the monitor screen is written, by the host

                            microcontroller, into external memory space (RAM). This memory space can

                            be located anywhere (e.g. using a block of RAM within the FPGA design, or

                            using the SRAM on the NanoBoard itself).

                            Pixel data is stored in the memory space in bytes. The number of pixels in a

                            byte depends on which color palette is being used for the image (selected by

                            the CMOD input):

                            • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels

                            • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels

                            • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel

                            (DATA[7..6] are not used)

                            The size of memory required to store a picture is determined by the total

                            number of viewable pixels in a line (determined by DISPSIZE_H), the total

                            number of viewable lines in a frame (determined by DISPSIZE_V) and the

                            number of pixels stored in each byte in memory space:

                            Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
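As a worked example of the memory formula, the following sketch computes the picture memory for each color palette. The mode keys and the function name are assumptions made for the example; the pixels-per-byte figures are the ones given in the text.

```python
# Pixels packed into one byte for each palette (from the text above).
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed for a picture: (PixTotal x LinTotal) / pixels per byte.

    The result is rounded up so that a partly filled last byte is counted.
    """
    total_pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-total_pixels // per_byte)  # ceiling division


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, picture_memory_bytes(640, 480, mode))
```

For a full 640x480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.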

                            The address in RAM where the next pixel is stored is determined using

                            an internal signal provided by the Synchronization Unit - hvcnt - which

                            reflects the current contents of the MEMADDR register. The exact addressing

                            is described below.

                            9.2 Black & White mode

                            The picture memory address - the byte of data containing the next 8 pixels

                            - is determined by using bits 18..3 of hvcnt and right shifting the contents by

                            three:

                            ADDR_PIXEL = 00 & hvcnt[18..3]

                            16 Colors mode

                            The picture memory address - the byte of data containing the next 2 pixels

                            - is determined by using bits 18..1 of hvcnt and right shifting the contents by

                            one:

                            ADDR_PIXEL = 0 & hvcnt[18..1]

                            64 Colors mode

                            The picture memory address - the byte of data containing the next pixel - is

                            determined by using the full value of hvcnt:

                            ADDR_PIXEL = hvcnt[18..0]
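The three addressing rules amount to a right shift of the pixel counter by 3, 1 or 0 bits. The sketch below models that with integers; treating hvcnt as a 19-bit unsigned value, the mode keys and the function name are assumptions made for the example.

```python
HVCNT_BITS = 19  # assumed width, matching the 19-bit ADDR_PIXEL bus

# Right shift applied to hvcnt in each palette mode (from the text above).
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory that holds the pixel counted by hvcnt."""
    hvcnt &= (1 << HVCNT_BITS) - 1  # keep only the 19 counter bits
    return hvcnt >> ADDRESS_SHIFT[mode]
```

For instance, pixel number 17 lives in byte 2 in Black & White mode (8 pixels per byte), in byte 8 in 16 Colors mode (2 pixels per byte) and in byte 17 in 64 Colors mode.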

                            The Pixel register (PIXREG)

                            The Pixel register is used to receive the byte of pixel data read from the

                            current address in memory space. The register PIXREG is reset to zero when

                            the VGA Controller receives an external reset signal (RST).

                            The register is updated on each rising edge of the CLK signal. Data can be

                            read from the memory space as long as the RD signal is active (High). The RD

                            signal is itself controlled by the external line display enable signal, enex. This

                            internally generated signal is defined as follows:

                            If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT >

                            (LinTotal+1)) and (VCOUNT < Frame Period)) then enex = 0

                            Else

                            enex = 1

                            When enex is Low, read access from memory is disabled (RD = 0).
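The enable condition can be transcribed directly into a small function. This is a sketch of the rule as stated above, not the Controller's actual HDL; the function name and argument names are illustrative, and the line and frame periods used in the usage note (800 and 525 counts) are assumed values for a 640x480 display rather than figures taken from this report.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while pixel data may be read, 0 during blanking.

    Mirrors the rule above: the signal is 0 when the horizontal count lies
    strictly between PixTotal+1 and the line period, or the vertical count
    lies strictly between LinTotal+1 and the frame period.
    """
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(enex_value):
    """RD follows enex: memory reads are disabled when enex is Low."""
    return 1 if enex_value else 0
```

With pix_total = 640, lin_total = 480, line_period = 800 and frame_period = 525, a count of (100, 100) gives enex = 1, while (700, 100) and (100, 500) both give enex = 0.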

                            The point at which data is loaded from memory into PIXREG depends on the

                            particular color palette that is chosen - Black & White, 16 Colors or 64

                            Colors.

                            Black & White mode

                            The next byte of data will be loaded into the register whenever the

                            lowest three bits of the hvcnt signal - received from the Synchronization Unit -

                            are "000".

                            For the currently loaded byte, the active pixel is always in the lowest bit

                            position of the Pixel register. Each pixel in the data byte is moved into this

                            active pixel position by shifting the contents of the register right by one bit on

                            each rising edge of CLK.

                            16 Colors mode

                            The next byte of data will be loaded into the register whenever the

                            lowest bit of the hvcnt signal - received from the Synchronization Unit - is a

                            0.

                            For the currently loaded byte, the active pixel is always in the low order

                            nibble of the Pixel register. Remember that in this mode, each byte of data

                            contains two pixels. The second pixel is moved into this active pixel position

                            by shifting the contents of the register right by four bits on the rising edge of

                            CLK.

                            64 Colors mode

                            The next byte of data will be loaded into the register on the rising edge

                            of the external system clock signal (CLK). In this mode, the read of pixel data

                            does not depend on the status of the hvcnt signal received from the

                            Synchronization Unit.
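The shifting behaviour of the Pixel register in the three modes can be illustrated by unpacking one data byte into the pixels it yields, in display order. This is a software sketch only; the mode keys and the function name are assumptions made for the example.

```python
def pixels_from_byte(data, mode):
    """Return the pixel values held in one byte, in the order they are shown.

    The active pixel is always taken from the low end of the register, and
    the register is shifted right by the pixel width to expose the next one.
    """
    data &= 0xFF
    if mode == "bw":   # 8 pixels, 1 bit each, shift right by one bit per clock
        return [(data >> i) & 0x1 for i in range(8)]
    if mode == "16":   # 2 pixels, 4 bits each, low nibble first
        return [data & 0xF, (data >> 4) & 0xF]
    if mode == "64":   # 1 pixel in DATA[5..0]; bits 7 and 6 are ignored
        return [data & 0x3F]
    raise ValueError(f"unknown mode: {mode}")
```

For example, the byte 0xA5 yields the two 4-bit pixels 5 and then 10 in 16 Colors mode, and the byte 0xFF yields the single 6-bit pixel 63 in 64 Colors mode.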

                            9.3 The RGB register (RGBREG)

                            The RGB register is used to store the six bits that are required for

                            driving the red, green and blue color guns of the target monitor. When the

                            chosen color palette is either Black & White or 16 Colors, these six bits are

                            obtained by mapping the value of the active pixel to a predefined RGB code.

                            When the chosen color palette is 64 Colors, the actual pixel value is used

                            directly.

                            The register RGBREG is reset to zero (000000) when the VGA

                            Controller receives an external reset signal (RST). This RGB code represents

                            black.

                            The register is updated on each rising edge of the CLK signal, and the

                            value loaded is dependent on the state of the line display enable signal, en.

                            When en is Low, blanking is required and RGBREG is loaded with the code

                            for black (000000).

                            Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)

                            and a 4-bit pixel (16 Colors mode) into the required RGB color code.

                            The RGB color code stored in the RGB register is output from the VGA

                            Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0

                            and B1).

                            The monitor itself expects analog signals as inputs to its electron gun control

                            circuits. This is achieved by using 2-bit digital to analog converter circuitry,

                            located on the NanoBoard itself, as shown in Figure 4.

                            Figure 9.1 Digital to Analog RGB conversion

                            For each color, the 2-bit digital signal from the VGA Controller can be

                            converted into 4 distinct analog levels. These levels specify the intensity of

                            each of the three primary colors to use when displaying the pixel on the

                            monitor's screen. The levels range from 0V (total darkness) to 0.7V (maximum

                            brightness).

                            With each analog input being one of four possible levels, the monitor

                            can display each pixel on the screen with one of 64 different color

                            permutations.
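The relationship between the 2-bit color values, the four analog levels and the 64 displayable colors can be sketched as follows. The text only states that the levels run from 0V to 0.7V; the evenly spaced steps used here are an assumption, as are the function names.

```python
FULL_SCALE_V = 0.7  # maximum brightness level given in the text


def dac_level(value2):
    """Analog level for a 2-bit value, assuming four evenly spaced steps."""
    return FULL_SCALE_V * (value2 & 0b11) / 3


def all_colors():
    """Every (R, G, B) combination of three independent 2-bit values."""
    return [(r, g, b) for r in range(4) for g in range(4) for b in range(4)]


if __name__ == "__main__":
    print([round(dac_level(v), 3) for v in range(4)])
    print(len(all_colors()))
```

Four levels on each of three channels give 4 x 4 x 4 = 64 combinations, which is where the 64 Colors palette comes from.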

                            Using Block RAM in Spartan-3E Generation FPGAs

                            For applications requiring large, on-chip memories, Spartan™-3

                            Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.

                            Using various configuration options, SelectRAM blocks create RAM, ROM,

                            FIFOs, large look-up tables, data width converters, circular buffers and shift

                            registers, each supporting various data widths and depths. This application note

                            describes the features and capabilities of block SelectRAM and illustrates how

                            to specify the various options using the Xilinx CORE Generator™ system or

                            via VHDL or Verilog instantiation.

                            Table 2 Block RAM Available in Spartan-3E Devices

                            Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits

                            XC3S100E    1             4                       4                  73,728           72

                            XC3

                            S250

                            E

                            2 6 1

                            2

                            2

                            2

                            1

                            1

                            8

                            4

                            2

                            1

                            6

                            XC3

                            S500

                            E

                            2 10 2

                            0

                            3

                            6

                            8

                            6

                            4

                            3

                            6

                            0

                            30

                            0

                            XC3

                            S120

                            0E

                            2 14 2

                            8

                            5

                            1

                            6

                            0

                            9

                            6

                            5

                            0

                            4

                            XC3

                            S160

                            0E

                            2 18 3

                            6

                            6

                            6

                            3

                            5

                            5

                            2

                            6

                            4

                            8

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
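As an illustration of the synchronous, single-port behavior described above, the following is a minimal sketch of a 16K x 1 memory written in plain VHDL. The entity name ram_16k_x1 and its port names are hypothetical and do not come from the application note. Synthesis tools commonly map this coding pattern onto block RAM, but whether they do depends on the tool and its settings, so treat this as an assumption rather than a guarantee.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port 16K x 1 memory; all names are illustrative only.
entity ram_16k_x1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity ram_16k_x1;

architecture rtl of ram_16k_x1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout only changes on an enabled clock edge.
        -- During a write, the previously stored value is read out.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write need a clock edge and an active clock enable, which matches the port behavior described in the paragraph above.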


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?
- Interoperability
- Technology independence
- Design reuse
- Several levels of abstraction
- Readability
- Standard language
- Widely supported

What is VHDL?
VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)
- Design specification language
- Design entry language
- Design simulation language
- Design documentation language
- An alternative to schematics

10.1 Brief History
- VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
- It was funded by the US Department of Defense.
- The first publicly available version was released in 1985.
- In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
- In 1987: standardization => IEEE 1076-1987.
- An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards
- IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
- Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.
- The IEEE 1164 standard was developed by the IEEE to address this.
- IEEE 1164 contains definitions for a nine-valued data type, std_logic.
- IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
- It defines, e.g., two numeric types: signed and unsigned.
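The types named above can be seen side by side in a small simulation-only sketch. The entity types_demo and its signal names are hypothetical; only the packages std_logic_1164 and numeric_std and their types are standard.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use ieee.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

entity types_demo is
end entity types_demo;

architecture sim of types_demo is
  -- std_logic takes one of nine values:
  -- 'U' 'X' '0' '1' 'Z' 'W' 'L' 'H' '-'
  signal bus_line : std_logic := 'Z';  -- high impedance
  signal raw      : std_logic_vector(3 downto 0) := "1111";
  signal as_u     : unsigned(3 downto 0);
  signal as_s     : signed(3 downto 0);
begin
  -- The same bit pattern interpreted as two different numeric types.
  as_u <= unsigned(raw);
  as_s <= signed(raw);

  check : process
  begin
    wait for 1 ns;
    assert to_integer(as_u) = 15 report "unsigned mismatch" severity error;
    assert to_integer(as_s) = -1 report "signed mismatch" severity error;
    wait;  -- suspend forever
  end process check;
end architecture sim;
```

The pattern "1111" reads as 15 when treated as unsigned and as -1 when treated as signed (two's complement), which is why the numeric standard keeps the two types distinct.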

VHDL Environment

Design Units
Segments of VHDL code that can be compiled separately and stored in a library.

Entities
- A black box with an interface definition
- Defines the inputs/outputs of a component (defines pins)
- A way to represent modularity in VHDL
- Similar to a symbol in a schematic
- An entity declaration describes the entity

E.g.

    entity Comparator is
      port (A, B : in  std_logic_vector(7 downto 0);
            EQ   : out std_logic);
    end Comparator;

Ports
- Provide channels of communication between the component and its environment
- Each port must have a name, a direction and a type
- An entity may have NO port declaration

Port directions
- In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
- Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.
- Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
- Buffer: An out port with read capability. May have at most one assignment (not recommended).
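A common way to avoid buffer ports is to keep an internal signal that can be both read and assigned, and to drive the out port from it. The sketch below assumes a hypothetical toggle flip-flop; the names toggle, q and q_int are illustrative.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical toggle flip-flop; names are illustrative only.
entity toggle is
  port (
    clk : in  std_logic;
    q   : out std_logic  -- mode out: not readable inside before VHDL-2008
  );
end entity toggle;

architecture rtl of toggle is
  signal q_int : std_logic := '0';  -- internal copy that can be read
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q_int <= not q_int;  -- read and assign the internal signal
    end if;
  end process;

  q <= q_int;  -- drive the out port from the internal signal
end architecture rtl;
```

The port stays a plain out port, so the entity connects to other components without the restrictions that buffer ports carry.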

Architectures
- Every entity has at least one architecture
- One entity can have several architectures
- Architectures can describe a design using: Behavior, Structure, Dataflow
- Architectures can describe a design on many levels: Gate level, RTL (Register Transfer Level), Behavioral level
- A configuration declaration links an architecture to an entity

E.g.

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations
- Link an entity declaration and an architecture body together
- The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed links to the entity
- Can be used to change simulation behavior without re-analyzing the VHDL source
- Complex configuration declarations are ignored in synthesis
- Some entities can have, e.g., a gate-level architecture and a behavioral architecture
- Are always optional
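To make the role of a configuration concrete, here is a minimal sketch that gives the Comparator entity a second architecture and then selects it for one instance. Comparator2, the test bench tb and the configuration tb_cfg are hypothetical names added for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow-style architecture.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Behavioral-style architecture with the same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

library ieee;
use ieee.std_logic_1164.all;

entity tb is
end entity tb;

architecture test of tb is
  component Comparator
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end component;
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : Comparator port map (A => a_s, B => b_s, EQ => eq_s);
end architecture test;

-- Binds instance dut to Comparator2 without editing the source above.
configuration tb_cfg of tb is
  for test
    for dut : Comparator
      use entity work.Comparator(Comparator2);
    end for;
  end for;
end configuration tb_cfg;
```

Simulating tb_cfg instead of tb switches the architecture used for dut; only the configuration needs to be re-analyzed.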

Packages
Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries
Collection of VHDL design units (database)
1. Packages (package declaration, package body)
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system; can also be any other kind of database.
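The split between a package declaration and a package body can be sketched as follows. The package vga_types, the subtype pixel_t, the function invert and the entity inverter are hypothetical names chosen for illustration; the package is assumed to be compiled into the default library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_types is
  constant PIXEL_BITS : natural := 1;
  subtype pixel_t is std_logic_vector(PIXEL_BITS - 1 downto 0);
  function invert (p : pixel_t) return pixel_t;
end package vga_types;

-- Package body: the implementation of the declared subprogram.
package body vga_types is
  function invert (p : pixel_t) return pixel_t is
  begin
    return not p;
  end function invert;
end package body vga_types;

library ieee;
use ieee.std_logic_1164.all;
use work.vga_types.all;  -- make the package contents visible

entity inverter is
  port (d : in  pixel_t;
        q : out pixel_t);
end entity inverter;

architecture rtl of inverter is
begin
  q <= invert(d);
end architecture rtl;
```

A package that declares only constants and types needs no body; the body is required here because the function invert must be implemented somewhere.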

Levels of Abstraction
VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description
- The circuit is described in terms of its components
- Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
- For large circuits a low-level description quickly becomes impractical
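A minimal structural sketch: a three-input AND gate built purely by wiring together two instances of a two-input AND component. The entities and2 and and3 are hypothetical examples.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Leaf component: a two-input AND gate.
entity and2 is
  port (a, b : in  std_logic;
        y    : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity and3 is
  port (a, b, c : in  std_logic;
        y       : out std_logic);
end entity and3;

-- Structural architecture: only component instances and wiring.
architecture structural of and3 is
  component and2
    port (a, b : in  std_logic;
          y    : out std_logic);
  end component;
  signal ab : std_logic;  -- net between the two gates
begin
  u1 : and2 port map (a => a,  b => b, y => ab);
  u2 : and2 port map (a => ab, b => c, y => y);
end architecture structural;
```

The architecture of and3 contains no logic expressions of its own, which is what makes it a structural description.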

10.3 Dataflow VHDL Description
- The circuit is described in terms of how data moves through the system
- In the dataflow style you describe how information flows between registers in the system
- The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
- The behavior of the system over time is defined by registers
- There are no built-in registers in the VHDL language: either a lower-level description or a behavioral description of sequential elements is needed
- The lower-level register descriptions must be created or obtained
- If there are no third-party models for registers => you must write the behavioral description of registers
- The behavioral description can be provided in the form of subprograms (functions or procedures)
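Since the language has no built-in register, a register is usually described behaviorally with a clocked process. The following is a minimal sketch of an 8-bit register with an asynchronous active-low reset; the entity name reg8 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register; names are illustrative only.
entity reg8 is
  port (clk   : in  std_logic;
        rst_n : in  std_logic;  -- asynchronous reset, active low
        d     : in  std_logic_vector(7 downto 0);
        q     : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst_n)
  begin
    if rst_n = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;  -- q only changes on a rising clock edge
    end if;
  end process;
end architecture behavioral;
```

A dataflow description can then instantiate such registers and describe the combinational logic between them with concurrent assignments.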

10.4 Behavioral VHDL Description
- The circuit is described in terms of its operation over time
- Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
- The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
- If no actual delay is used, the order of sequential operations is defined
- In the lower levels of abstraction (e.g. RTL) synthesis tools ignore detailed timing specifications
- The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
- There are a few tools for behavioral synthesis
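The delay syntax mentioned above can be tried in a small simulation-only sketch. The entity delay_demo and the signal names are hypothetical; the delay is meaningful in simulation, and synthesis tools generally ignore it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
  signal b : std_logic := '0';
  signal a : std_logic;
begin
  -- a follows b ten nanoseconds later (simulation only).
  a <= b after 10 ns;

  stim : process
  begin
    b <= '1';        -- change b at time zero
    wait for 15 ns;  -- the delayed update has taken effect by now
    assert a = '1' report "a did not follow b" severity error;
    wait;            -- suspend forever
  end process stim;
end architecture sim;
```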

Concurrent vs Sequential

Processes
- Basic simulation concept in VHDL
- A VHDL description can always be broken up into interconnected processes
- Quite similar to a Unix process
- Process keyword in VHDL
- A process statement is a concurrent statement
- Statements inside process statements are sequential statements
- A process must contain either a sensitivity list or wait statement(s), but NOT both
- The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
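The two ways of waking a process can be compared directly. In this minimal sketch both processes compute the same AND function; the entity and_gate_demo and its port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait as the last statement.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process with_wait;
end architecture behavioral;
```

A sensitivity list behaves like a wait on statement placed at the end of the process, which is why the two forms are equivalent and why mixing them in one process is not allowed.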


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in  std_logic;
  rstn   : in  std_logic;
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Image memory, one bit per pixel.
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Port modes of line_count and c_hs match the sync_unit entity (out).
  component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
  end component;

  component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two to obtain the pixel clock.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  -- horizontal sync pulse (row)
  vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- Horizontal counter: counts pixel clocks 0 to 793 within a line.
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- Vertical line counter: advances on each rising edge of h_en_s
  -- and wraps after line 515.
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse (active low while c_hs_s <= 95)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en_s, used to detect its rising edge.
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
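The counter limits in the listing above determine the scan rates. The following sketch collects them as constants and derives the line and frame rates. It assumes that clk50m is a 50 MHz input (suggested by its name only), so that the divided clock is 25 MHz; the package name sync_timing and all constant names are hypothetical.

```vhdl
package sync_timing is
  -- Values read from the sync_unit listing.
  constant H_TOTAL : natural := 794;  -- c_hs_s counts 0 to 793
  constant V_TOTAL : natural := 516;  -- line_count_s counts 0 to 515

  -- Assumption: clk50m is 50 MHz, so the divided clock is 25 MHz.
  constant PIXEL_CLK_HZ : natural := 25_000_000;

  -- Integer division truncates the results.
  constant LINE_RATE_HZ  : natural := PIXEL_CLK_HZ / H_TOTAL;              -- 31486
  constant FRAME_RATE_HZ : natural := PIXEL_CLK_HZ / (H_TOTAL * V_TOTAL);  -- 61
end package sync_timing;
```

Under that clock assumption the design scans roughly 31.5 thousand lines and about 61 frames per second, close to the nominal 60 Hz of the 640 x 480 VGA mode.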


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart at the first pixel of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image: show the memory bit on all three colors.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the image: green.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the image: white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last pixel.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: all colors off.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


                            12 FPGA ARCHITECTURE

                            A field-programmable gate array (FPGA) is a semiconductor device that can
                            be configured by the customer or designer after manufacturing, hence the
                            name "field-programmable". FPGAs are programmed using a logic circuit
                            diagram or source code in a hardware description language (HDL) to specify
                            how the chip will work. They can be used to implement any logical function
                            that an application-specific integrated circuit (ASIC) could perform, but the
                            ability to update the functionality after shipping offers advantages for many
                            applications.

                            FPGAs contain programmable logic components called "logic blocks" and a
                            hierarchy of reconfigurable interconnects that allow the blocks to be wired
                            together, somewhat like a one-chip programmable breadboard. Logic blocks
                            can be configured to perform complex combinational functions or merely
                            simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
                            include memory elements, which may be simple flip-flops or more complete
                            blocks of memory.

                            History

                            The FPGA industry sprouted from programmable read-only memory (PROM)
                            and programmable logic devices (PLDs). PROMs and PLDs both had the option
                            of being programmed in batches in a factory or in the field
                            (field-programmable); however, programmable logic was hard-wired between
                            logic gates.

                            Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
                            first commercially viable field-programmable gate array in 1985: the
                            XC2064. The XC2064 had programmable gates and programmable
                            interconnects between gates, the beginnings of a new technology and market.
                            The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
                            3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
                            into the National Inventors Hall of Fame for his invention.


                            Some of the industry's foundational concepts and technologies for
                            programmable logic arrays, gates, and logic blocks are founded in patents
                            awarded to David W. Page and LuVerne R. Peterson in 1985.

                            In the late 1980s, the Naval Surface Warfare Department funded an experiment
                            proposed by Steve Casselman to develop a computer that would implement
                            600,000 reprogrammable gates. Casselman was successful, and the system was
                            awarded a patent in 1992.

                            Xilinx continued unchallenged and quickly growing from 1985 to the
                            mid-1990s, when competitors sprouted up, eroding significant market share. By
                            1993, Actel was serving about 18 percent of the market.

                            The 1990s were an explosive period of time for FPGAs, both in sophistication
                            and in the volume of production. In the early 1990s, FPGAs were primarily used
                            in telecommunications and networking. By the end of the decade, FPGAs had
                            found their way into consumer, automotive, and industrial applications.

                            FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
                            algorithm technology and FPGAs to create a sound recognition device.
                            Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
                            to decide the configuration needed to accomplish a sound recognition task.

                            12.1 Modern developments

                            A recent trend has been to take the coarse-grained architectural
                            approach a step further by combining the logic blocks and interconnects of
                            traditional FPGAs with embedded microprocessors and related peripherals to
                            form a complete "system on a programmable chip". This work mirrors the
                            architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
                            Group, which combined a reconfigurable CPU architecture on a single chip
                            called the SB24. That work was done in 1982. Examples of such hybrid
                            technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
                            which include one or more PowerPC processors embedded within the FPGA's
                            logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
                            processor in combination with Atmel's programmable logic architecture.

                            An alternative approach to using hard-macro processors is to make use of
                            "soft" processor cores that are implemented within the FPGA logic.

                            As previously mentioned, many modern FPGAs have the ability to be
                            reprogrammed at run time, and this is leading to the idea of reconfigurable
                            computing or reconfigurable systems: CPUs that reconfigure themselves to
                            suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
                            example of a reconfigurable soft processor implemented on FPGAs. However,
                            it does not support dynamic reconfiguration at run time, but instead adapts
                            itself to a specific program.

                            Additionally, new non-FPGA architectures are beginning to emerge.
                            Software-configurable microprocessors such as the Stretch S5000 adopt a
                            hybrid approach by providing an array of processor cores and FPGA-like
                            programmable cores on the same chip.

                            Gates

                            1987: 9,000 gates, Xilinx
                            1992: 600,000, Naval Surface Warfare Department
                            Early 2000s: Millions

                            Market size

                            1985: First commercial FPGA technology invented by Xilinx
                            1987: $14 million
                            ~1993: >$385 million
                            2005: $1.9 billion
                            2010 estimates: $2.75 billion


                            12.2 FPGA Comparisons

                            Historically, FPGAs have been slower, less energy efficient, and generally
                            achieved less functionality than their fixed ASIC counterparts. A combination
                            of volume, fabrication improvements, research and development, and the I/O
                            capabilities of new supercomputers have largely closed the performance gap
                            between ASICs and FPGAs.

                            Advantages include a shorter time to market, the ability to re-program in the
                            field to fix bugs, and lower non-recurring engineering costs. Vendors can also
                            take a middle road by developing their hardware on ordinary FPGAs but
                            manufacturing their final version so that it can no longer be modified after
                            the design has been committed.

                            Xilinx claims that several market and technology dynamics are changing the
                            ASIC/FPGA paradigm:

                            IC costs are rising aggressively

                            ASIC complexity has bolstered development time and costs

                            R&D resources and headcount are decreasing

                            Revenue losses for slow time-to-market are increasing

                            Financial constraints in a poor economy are driving low-cost technologies

                            These trends make FPGAs a better alternative to ASICs for a growing
                            number of higher-volume applications than they have historically been used
                            for, which the company credits for the growing number of FPGA design starts
                            (see History).

                            The primary differences between CPLDs and FPGAs are architectural. A
                            CPLD has a somewhat restrictive structure consisting of one or more
                            programmable sum-of-products logic arrays feeding a relatively small number
                            of clocked registers. The result of this is less flexibility, with the
                            advantage of more predictable timing delays and a higher logic-to-interconnect
                            ratio. The FPGA architectures, on the other hand, are dominated by
                            interconnect. This makes them far more flexible (in terms of the range of
                            designs that are practical for implementation within them) but also far more
                            complex to design for.

                            Another notable difference between CPLDs and FPGAs is the presence in most
                            FPGAs of higher-level embedded functions (such as adders and multipliers)
                            and embedded memories, as well as logic blocks that can implement decoders
                            or mathematical functions.

                            Some FPGAs have the capability of partial re-configuration, which lets one
                            portion of the device be re-programmed while other portions continue running.

                            12.3 Applications

                            Applications of FPGAs include digital signal processing, software-defined
                            radio, aerospace and defense systems, ASIC prototyping, medical imaging,
                            computer vision, speech recognition, cryptography, bioinformatics, computer
                            hardware emulation, radio astronomy, and a growing range of other areas.

                            FPGAs originally began as competitors to CPLDs and competed in a similar
                            space, that of glue logic for PCBs. As their size, capabilities, and speed
                            increased, they began to take over larger and larger functions, to the point
                            where some are now marketed as full systems on chips (SoC). Particularly with
                            the introduction of dedicated multipliers into FPGA architectures in the late
                            1990s, applications which had traditionally been the sole reserve of DSPs
                            began to incorporate FPGAs instead.

                            FPGAs especially find applications in any area or algorithm that can make use
                            of the massive parallelism offered by their architecture. One such area is
                            code breaking, in particular brute-force attacks on cryptographic algorithms.

                            FPGAs are increasingly used in conventional high-performance computing
                            applications, where computational kernels such as FFT or convolution are
                            performed on the FPGA instead of a microprocessor.


                            The inherent parallelism of the logic resources on an FPGA allows for
                            considerable computational throughput even at low MHz clock rates. The
                            flexibility of the FPGA allows for even higher performance by trading off
                            precision and range in the number format for an increased number of parallel
                            arithmetic units. This has driven a new type of processing called
                            reconfigurable computing, where time-intensive tasks are offloaded from
                            software to FPGAs.

                            The adoption of FPGAs in high-performance computing is currently limited by
                            the complexity of FPGA design compared to conventional software and by the
                            extremely long turn-around times of current design tools, where a wait of
                            4 to 8 hours is necessary after even minor changes to the source code.

                            Traditionally, FPGAs have been reserved for specific vertical applications
                            where the volume of production is small. For these low-volume applications,
                            the premium that companies pay in hardware costs per unit for a
                            programmable chip is more affordable than the development resources spent
                            on creating an ASIC. Today, new cost and performance dynamics have broadened
                            the range of viable applications.

                            Architecture

                            The most common FPGA architecture consists of an array of configurable
                            logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
                            routing channels have the same width (number of wires). Multiple I/O pads may
                            fit into the height of one row or the width of one column in the array.

                            An application circuit must be mapped into an FPGA with adequate resources.
                            While the number of CLBs and I/Os required is easily determined from the
                            design, the number of routing tracks needed may vary considerably, even
                            among designs with the same amount of logic. (For example, a crossbar switch
                            requires much more routing than a systolic array with the same gate count.)
                            Since unused routing tracks increase the cost (and decrease the performance)
                            of the part without providing any benefit, FPGA manufacturers try to provide
                            just enough tracks so that most designs that will fit in terms of LUTs and
                            I/Os can be routed. This is determined by estimates such as those derived
                            from Rent's rule or by experiments with existing designs.

                            The FPGA considered here is an array, or island-style, FPGA. It consists of
                            an array of logic blocks and routing channels. Two I/O pads fit into the
                            height of one row or the width of one column, as shown below. All the routing
                            channels have the same width (number of wires).

                            12.1 FPGA structure

                            A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                            flip-flop, as shown below. In recent years, manufacturers have started moving
                            to 6-input LUTs in their high-performance parts, claiming increased
                            performance.

                            Typical logic block


                            There is only one output, which can be either the registered or the
                            unregistered LUT output. The logic block has four inputs for the LUT and a
                            clock input. Since clock signals (and often other high-fanout signals) are
                            normally routed via special-purpose dedicated routing networks in commercial
                            FPGAs, they and other signals are separately managed.
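
As a rough illustration of the logic block described above, the following VHDL sketch models a 4-input LUT feeding a flip-flop, with a single output that is either registered or unregistered. The entity name logic_block, the generics LUT_INIT and USE_FF, and the port names are assumptions made for this example only; in a real FPGA the LUT contents and the output selection are set by the configuration bitstream, not written by hand like this.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Behavioral sketch of a classic logic block (hypothetical names throughout).
entity logic_block is
  generic (
    -- 16-entry truth table of the LUT; x"8000" makes it a 4-input AND
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";
    -- true: output is the flip-flop, false: output is the raw LUT value
    USE_FF   : boolean := true
  );
  port (
    clk    : in  std_logic;
    lut_in : in  std_logic_vector(3 downto 0);
    q      : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The four inputs select one bit of the truth table
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- Flip-flop that registers the LUT output on the rising clock edge
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT output
  q <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without touching the structure, which is the property that makes a LUT-based block programmable.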

                            For this example architecture, the locations of the FPGA logic block pins are
                            shown below.

                            Logic Block Pin Locations

                            Each input is accessible from one side of the logic block, while the output
                            pin can connect to routing wires in both the channel to the right and the
                            channel below the logic block.

                            Each logic block output pin can connect to any of the wiring segments in the
                            channels adjacent to it.

                            Similarly, an I/O pad can connect to any one of the wiring segments in the
                            channel adjacent to it. For example, an I/O pad at the top of the chip can
                            connect to any of the W wires (where W is the channel width) in the horizontal
                            channel immediately below it.

                            Generally, the FPGA routing is unsegmented. That is, each wiring segment
                            spans only one logic block before it terminates in a switch box. By turning on
                            some of the programmable switches within a switch box, longer paths can be
                            constructed. For higher-speed interconnect, some FPGA architectures use
                            longer routing lines that span multiple logic blocks.

                            Whenever a vertical and a horizontal channel intersect, there is a switch
                            box. In this architecture, when a wire enters a switch box, there are three
                            programmable switches that allow it to connect to three other wires in
                            adjacent channel segments. The pattern, or topology, of switches used in this
                            architecture is the planar or domain-based switch box topology. In this
                            switch box topology, a wire in track number 1 connects only to wires in track
                            number 1 in adjacent channel segments, wires in track number 2 connect
                            only to other wires in track number 2, and so on. The figure below illustrates
                            the connections in a switch box.

                            Switch box topology

                            Modern FPGA families expand upon the above capabilities to include
                            higher-level functionality fixed into the silicon. Having these common
                            functions embedded into the silicon reduces the area required and gives those
                            functions increased speed compared to building them from primitives. Examples
                            of these include multipliers, generic DSP blocks, embedded processors,
                            high-speed I/O logic, and embedded memories.


                            FPGAs are also widely used for systems validation, including pre-silicon
                            validation, post-silicon validation, and firmware development. This allows
                            chip companies to validate their design before the chip is produced in the
                            factory, reducing the time to market.

                            12.4 FPGA Design and Programming

                            To define the behavior of the FPGA, the user provides a hardware description
                            language (HDL) or a schematic design. The HDL form might be easier to work
                            with when handling large structures, because it is possible to just specify
                            them numerically rather than having to draw every piece by hand. On the other
                            hand, schematic entry can allow for easier visualisation of a design.

                            Then, using an electronic design automation tool, a technology-mapped netlist
                            is generated. The netlist can then be fitted to the actual FPGA architecture
                            using a process called place-and-route, usually performed by the FPGA
                            company's proprietary place-and-route software. The user will validate the
                            map, place, and route results via timing analysis, simulation, and other
                            verification methodologies. Once the design and validation process is
                            complete, the binary file generated (also using the FPGA company's
                            proprietary software) is used to (re)configure the FPGA.

                            Going from schematic/HDL source files to actual configuration: the source
                            files are fed to a software suite from the FPGA/CPLD vendor that, through
                            different steps, will produce a file. This file is then transferred to the
                            FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                            like an EEPROM.

                            The most common HDLs are VHDL and Verilog, although in an attempt to
                            reduce the complexity of designing in HDLs, which have been compared to the
                            equivalent of assembly languages, there are moves to raise the abstraction
                            level through the introduction of alternative languages.


                            To simplify the design of complex systems in FPGAs, there exist libraries of
                            predefined complex functions and circuits that have been tested and optimized
                            to speed up the design process. These predefined circuits are commonly called
                            IP cores and are available from FPGA vendors and third-party IP suppliers
                            (rarely free, and typically released under proprietary licenses). Other
                            predefined circuits are available from developer communities such as
                            OpenCores (typically free, and released under the GPL, BSD, or a similar
                            license) and other sources.

                            In a typical design flow, an FPGA application developer will simulate the
                            design at multiple stages throughout the design process. Initially, the RTL
                            description in VHDL or Verilog is simulated by creating test benches to
                            simulate the system and observe results. Then, after the synthesis engine has
                            mapped the design to a netlist, the netlist is translated to a gate-level
                            description, where simulation is repeated to confirm that the synthesis
                            proceeded without errors. Finally, the design is laid out in the FPGA, at
                            which point propagation delays can be added and the simulation run again with
                            these values back-annotated onto the netlist.
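
The RTL simulation step of this flow can be sketched with a small self-checking test bench. The design under test here, a 4-bit counter named counter4, is a hypothetical stand-in chosen to keep the example short; it is not part of the VGA controller. The 40 ns clock period matches the 25 MHz clock used elsewhere in this report. The same test bench could, in principle, be re-run against the gate-level netlist after synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical design under test: 4-bit counter with active-low reset
entity counter4 is
  port (
    clk, rstn : in  std_logic;
    count     : out std_logic_vector(3 downto 0)
  );
end entity counter4;

architecture rtl of counter4 is
  signal cnt : unsigned(3 downto 0) := (others => '0');
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      cnt <= (others => '0');
    elsif rising_edge(clk) then
      cnt <= cnt + 1;
    end if;
  end process;
  count <= std_logic_vector(cnt);
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Test bench: no ports, drives the DUT and checks its response
entity counter4_tb is
end entity counter4_tb;

architecture sim of counter4_tb is
  constant CLK_PERIOD : time := 40 ns;  -- 25 MHz
  signal clk   : std_logic := '0';
  signal rstn  : std_logic := '0';
  signal count : std_logic_vector(3 downto 0);
  signal done  : boolean := false;
begin
  dut : entity work.counter4
    port map (clk => clk, rstn => rstn, count => count);

  -- Free-running clock that stops once the stimulus process is done
  clk_gen : process
  begin
    while not done loop
      clk <= '0'; wait for CLK_PERIOD / 2;
      clk <= '1'; wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process clk_gen;

  stim : process
  begin
    rstn <= '0';
    wait for 2 * CLK_PERIOD;   -- hold reset for two clock periods
    rstn <= '1';
    for i in 1 to 5 loop       -- let five rising edges pass
      wait until rising_edge(clk);
    end loop;
    wait for 1 ns;             -- allow the output to settle
    assert unsigned(count) = 5
      report "counter4: expected count of 5 after five clock edges"
      severity error;
    done <= true;
    wait;
  end process stim;
end architecture sim;
```

The assert statement makes the test bench self-checking, so a failure is reported by the simulator instead of having to be spotted in a waveform.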

                            Basic Process Technology Types

                            SRAM - Based on static memory technology. In-system programmable and
                            re-programmable. Requires external boot devices. CMOS.

                            Antifuse - One-time programmable. CMOS.

                            EPROM - Erasable Programmable Read-Only Memory technology. Usually
                            one-time programmable in production because of plastic packaging.
                            Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                            EEPROM - Electrically Erasable Programmable Read-Only Memory
                            technology. Can be erased, even in plastic packages. Some, but not all,
                            EEPROM devices can be in-system programmed. CMOS.

                            Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                            packages. Some, but not all, flash devices can be in-system programmed.
                            Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                            therefore less expensive to manufacture. CMOS.

                            Fuse - One-time programmable. Bipolar.

                            Major Manufacturers

                            Xilinx and Altera are the current FPGA market leaders and long-time industry
                            rivals. Together, they control over 80 percent of the market, with Xilinx
                            alone representing over 50 percent.

                            Xilinx also provides free Windows and Linux design software, while Altera
                            provides free Windows tools; the Solaris and Linux tools are only available
                            via a rental scheme.

                            Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                            (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                            Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
                            focused CSSP, no general-purpose FPGAs).

                            12.5 FPGA prototype

                            FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                            prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
                            hardware verification and early software development.

                            Verifying a hardware design alongside early software and firmware co-design
                            has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become
                            a good method to do this.


                            Reasons why Prototyping is important

                            1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
                            it is functionally correct. This is compared to designers relying only on
                            software simulations to verify that their hardware design is sound.
                            Simulation speed and modeling accuracy limitations hinder this development.

                            2. Due to time constraints, many projects cannot wait until the silicon is
                            back from the foundry to start on software tests. FPGA prototyping allows
                            much more time for software development and testing at the
                            software-hardware integration stage. This allows many unforeseen software
                            bugs, which appear due to today's array of operating systems, applications,
                            and hardware, to be found.

                            3. Prototyping also allows the developer to ensure that all IP technologies
                            on the system work well together beyond the simulation stage and in actual
                            form.

                            4. Prototyping has the added advantage of serving as a demo platform for SoC
                            clients, bringing in interest early. This speeds up the overall development
                            cycle and allows for more enhancement or improvement of the chip features
                            than would otherwise have been possible.


                            13 SIMULATION RESULTS

                            GATE LEVEL

                            Fig VGA CONTROLLER

                            GATE LEVEL

                            Fig SYNC UNIT

                            TECHNOLOGY SCHEMATIC

                            14 APPLICATIONS

                            Motion Tracking Systems

                            Image processing systems

                            Displaying Systems

                            14.1 CONCLUSION

                            Hence, when implemented on an FPGA, the Controller takes a
                            processor-generated (pixelated) picture from memory space and provides
                            digital RGB values for each pixel, as well as horizontal and vertical
                            synchronization signals, in order to correctly display the picture on a
                            connected monitor.




                              => To send the information for 1 pixel we use 1 clock cycle (the clock
                              controls all operations):

                              1 clock = 1 pixel

                              1 clock cycle period = 40 ns

                              Clock frequency = 1 / time period = 1 / 40 ns = 25 MHz

                              So if we use a 25 MHz clock and display the information for 1 pixel on each
                              and every clock, we can attain the minimum refresh rate of 60 Hz.
                              Do not use a clock frequency of less than 25 MHz.
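
The relation between pixel clock and refresh rate can be checked with a short simulation-only VHDL sketch. It assumes the standard 640x480 frame of 800 clocks per line and 525 lines per frame, including blanking; these totals are not stated in this section and are an assumption of the example, as are the names refresh_rate_check and refresh_hz. Under that assumption a 25 MHz clock gives roughly 59.5 frames per second, which is the nominal 60 Hz rate, while 15 MHz falls well short of it.

```vhdl
-- Simulation-only check of refresh rate versus pixel clock (hypothetical names).
entity refresh_rate_check is
end entity refresh_rate_check;

architecture sim of refresh_rate_check is
  -- Assumed frame totals for 640x480, including blanking intervals
  constant H_TOTAL : real := 800.0;  -- pixel clocks per line
  constant V_TOTAL : real := 525.0;  -- lines per frame

  -- One pixel per clock, so frames per second = clock / (clocks per frame)
  function refresh_hz (clk_hz : real) return real is
  begin
    return clk_hz / (H_TOTAL * V_TOTAL);
  end function refresh_hz;
begin
  process
  begin
    report "15 MHz -> " & real'image(refresh_hz(15.0e6)) & " Hz";
    report "25 MHz -> " & real'image(refresh_hz(25.0e6)) & " Hz";
    report "30 MHz -> " & real'image(refresh_hz(30.0e6)) & " Hz";

    -- 25 MHz lands just under 60 Hz (about 59.5 Hz)
    assert refresh_hz(25.0e6) > 59.0 and refresh_hz(25.0e6) < 60.0
      report "unexpected refresh rate at 25 MHz" severity failure;
    -- A slower clock cannot reach 60 frames per second
    assert refresh_hz(15.0e6) < 60.0
      report "15 MHz should not reach 60 Hz" severity failure;
    -- A faster clock raises the refresh rate (about 71 Hz at 30 MHz)
    assert refresh_hz(30.0e6) > 60.0
      report "30 MHz should exceed 60 Hz" severity failure;
    wait;
  end process;
end architecture sim;
```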

                              Ex:

                              15 MHz (clock frequency decreases): time period = more, no. of frames/sec =
                              less (it is unable to display 60 frames/sec)

                              25 MHz: effectively utilized frequency

                              30 MHz (clock frequency increases): time period = less, no. of frames/sec =
                              more

                              Signals used for image display:

                              R (Red)
                              G (Green)
                              B (Blue)
                              HSYNC (Horizontal synchronization signal)
                              VSYNC (Vertical synchronization signal)

                              All the above signals are used to control the image data so that it is
                              correctly displayed on the VGA display monitor.

7.2 Timings for front porch, sync pulse and back porch

              Horizontal sync   Vertical sync
Front porch   0.94 µs           0.35 ms
Sync pulse    3.77 µs           0.063 ms
Back porch    1.89 µs           1.02 ms
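The timings listed above can be added up to see what line period, frame period and refresh rate they imply. The Python sketch below assumes a 640x480 visible area and the 40 ns pixel period from the previous section; the function names are illustrative.

```python
# Sketch: line and frame period built from the horizontal/vertical timings above.
# ASSUMPTIONS: 640x480 visible area and a 40 ns (25 MHz) pixel clock.
PIXEL_PERIOD_US = 0.040
VISIBLE_PIXELS = 640
VISIBLE_LINES = 480

# Horizontal timings in microseconds, vertical timings in milliseconds.
H_FRONT_US, H_SYNC_US, H_BACK_US = 0.94, 3.77, 1.89
V_FRONT_MS, V_SYNC_MS, V_BACK_MS = 0.35, 0.063, 1.02


def line_period_us() -> float:
    """Visible pixel time plus horizontal blanking."""
    visible = VISIBLE_PIXELS * PIXEL_PERIOD_US
    return visible + H_FRONT_US + H_SYNC_US + H_BACK_US


def frame_period_ms() -> float:
    """Visible line time plus vertical blanking."""
    visible = VISIBLE_LINES * line_period_us() / 1000.0
    return visible + V_FRONT_MS + V_SYNC_MS + V_BACK_MS


def refresh_rate_hz() -> float:
    return 1000.0 / frame_period_ms()


if __name__ == "__main__":
    print(f"line period  = {line_period_us():.2f} us")
    print(f"frame period = {frame_period_ms():.3f} ms")
    print(f"refresh rate = {refresh_rate_hz():.1f} Hz")
```

With these inputs the line period comes to about 32.2 µs and the frame period to about 16.9 ms, which is a refresh rate just under 60 Hz.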


8 FUNCTIONAL DESCRIPTION

8.1 Symbol

                              VGA Controller Symbol

                              Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows:
    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz
    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows:
    Black & White mode: 8 1-bit pixels
    16 Colors mode: 2 4-bit pixels
    64 Colors mode: 1 6-bit pixel (DATA[5..0])
    Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive – once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R0, R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G0, G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B0, B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                              9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode – one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
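As a worked example of the formula above, the Python sketch below computes the picture memory size for each color palette. The mode names and the function name are illustrative assumptions; pix_total and lin_total stand for PixTotal and LinTotal.

```python
# Sketch: picture memory size in bytes for the three color palettes.
# The dictionary keys and function name are illustrative, not taken from the Controller.
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total: int, lin_total: int, mode: str) -> int:
    """(PixTotal x LinTotal) / number of pixels per byte, rounded up."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Ceiling division so a partially filled last byte is still counted.
    return -(-pixels // per_byte)


if __name__ == "__main__":
    for mode in PIXELS_PER_BYTE:
        print(mode, picture_memory_bytes(640, 480, mode), "bytes")
```

For a 640x480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.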

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit – hvcnt – which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address – the byte of data containing the next 8 pixels – is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address – the byte of data containing the next 2 pixels – is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address – the byte of data containing the next pixel – is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
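The three addressing rules above amount to dividing the pixel count by the number of pixels per byte. A minimal Python sketch follows, treating hvcnt as a 19-bit integer; the mode names and function name are assumptions for illustration.

```python
# Sketch: byte address of the pixel numbered hvcnt, for each color palette.
# Integer shifts stand in for the bit slicing and zero padding of the hardware.
HVCNT_MASK = (1 << 19) - 1  # hvcnt is treated as bits 18..0


def pixel_address(hvcnt: int, mode: str) -> int:
    hvcnt &= HVCNT_MASK
    if mode == "bw":
        return hvcnt >> 3   # bits 18..3: 8 pixels share one byte
    if mode == "16_colors":
        return hvcnt >> 1   # bits 18..1: 2 pixels share one byte
    if mode == "64_colors":
        return hvcnt        # bits 18..0: 1 pixel per byte
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for count in (0, 7, 8, 9):
        print(count, pixel_address(count, "bw"), pixel_address(count, "16_colors"))
```

Pixels 0 to 7 all map to byte 0 in Black & White mode, and pixel 8 is the first pixel of byte 1.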

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
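The enex definition above translates directly into a small function. The Python sketch below is a behavioral model only; the argument names mirror HCOUNT, VCOUNT, PixTotal, LinTotal, Line Period and Frame Period, and the numeric values in the usage lines are assumed for illustration.

```python
# Sketch: behavioral model of the enex (and therefore RD) signal.
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 0 while the counters are in the blanking region, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(*args: int) -> int:
    """RD follows enex: memory reads are disabled while enex is Low."""
    return enex(*args)


if __name__ == "__main__":
    # ASSUMED example values: 640x480 visible, 800 counts per line, 525 per frame.
    print(rd(100, 100, 640, 480, 800, 525))  # inside the visible area
    print(rd(700, 100, 640, 480, 800, 525))  # horizontal blanking
    print(rd(100, 500, 640, 480, 800, 525))  # vertical blanking
```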

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen – Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal – received from the Synchronization Unit – are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal – received from the Synchronization Unit – is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
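The load-and-shift behavior described for the three modes can be modeled in a few lines. The Python sketch below returns the pixels of one memory byte in the order they become the active pixel; it is a behavioral model under the description above, not the Controller's implementation, and the names are illustrative.

```python
# Sketch: order in which the pixels of one byte reach the active position of PIXREG.
from typing import List

# (pixels per byte, bits per pixel) for each palette
MODE_LAYOUT = {"bw": (8, 1), "16_colors": (2, 4), "64_colors": (1, 6)}


def pixels_from_byte(data: int, mode: str) -> List[int]:
    count, width = MODE_LAYOUT[mode]
    mask = (1 << width) - 1
    pixreg = data & 0xFF             # byte loaded from picture memory
    pixels = []
    for _ in range(count):
        pixels.append(pixreg & mask)  # active pixel sits in the low-order bits
        pixreg >>= width              # shift right to bring in the next pixel
    return pixels


if __name__ == "__main__":
    print(pixels_from_byte(0b10110001, "bw"))
    print(pixels_from_byte(0xA5, "16_colors"))
    print(pixels_from_byte(0xFF, "64_colors"))
```

In 64 Colors mode the mask drops bits 7 and 6, matching the note that those bits are not used.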

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).
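The update rule for RGBREG can be summarized as a next-value function. The Python sketch below covers only the reset, blanking and 64 Colors cases described above; the palette mapping for Black & White and 16 Colors is not reproduced, and the function name is an assumption.

```python
# Sketch: value loaded into RGBREG on a rising edge of CLK (64 Colors mode only).
BLACK = 0b000000


def next_rgbreg(rst: int, en: int, pixel_value: int) -> int:
    if rst:
        return BLACK               # external reset clears the register
    if not en:
        return BLACK               # blanking: load the code for black
    return pixel_value & 0x3F      # 64 Colors: the 6-bit pixel value is used directly


if __name__ == "__main__":
    print(format(next_rgbreg(0, 1, 0b101101), "06b"))
    print(format(next_rgbreg(0, 0, 0b101101), "06b"))
```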

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1: Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
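The count of 64 colors follows from three channels with four levels each. The Python sketch below also shows what the four levels would be if the DAC steps were evenly spaced between 0 V and 0.7 V; the even spacing is an assumption, since the text only gives the two end points.

```python
# Sketch: 2-bit DAC levels and the resulting number of colors.
# ASSUMPTION: the four levels are evenly spaced between 0 V and 0.7 V.
V_MAX = 0.7  # volts, maximum brightness


def dac_level_volts(code: int) -> float:
    """Map a 2-bit code (0..3) to an analog level."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * code / 3


LEVELS = [dac_level_volts(code) for code in range(4)]
NUM_COLORS = len(LEVELS) ** 3  # one level per channel for R, G and B

if __name__ == "__main__":
    print([round(v, 3) for v in LEVELS])
    print(NUM_COLORS)
```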

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
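To connect the block RAM figures with the picture memory discussed earlier, the Python sketch below estimates how many block RAMs a picture would occupy and which devices in Table 2 have that many. It assumes the picture is held entirely in block RAM, that only the 16K data bits of each block are used, and that the stored bits per pixel are 1, 4 or 8 depending on the palette; the function names are illustrative.

```python
# Sketch: block RAMs needed to hold one picture, using only the 16K data bits per block.
DATA_BITS_PER_BLOCK = 16 * 1024

# Total RAM blocks per device, from Table 2.
TOTAL_RAM_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def block_rams_needed(pix_total: int, lin_total: int, stored_bits_per_pixel: int) -> int:
    """Ceiling of total picture bits over data bits per block."""
    total_bits = pix_total * lin_total * stored_bits_per_pixel
    return -(-total_bits // DATA_BITS_PER_BLOCK)


def devices_that_fit(blocks: int) -> list:
    return [name for name, total in TOTAL_RAM_BLOCKS.items() if total >= blocks]


if __name__ == "__main__":
    bw_blocks = block_rams_needed(640, 480, 1)
    print(bw_blocks, devices_that_fit(bw_blocks))
    c16_blocks = block_rams_needed(640, 480, 4)
    print(c16_blocks, devices_that_fit(c16_blocks))
```

Under these assumptions a 640x480 Black & White picture needs 19 blocks, which fits the XC3S500E and larger devices, while a 640x480 16 Colors picture needs 75 blocks and would have to use external memory.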


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
• Funded by the US Department of Defense.
• The first publicly available version was released in 1985.
• In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
• In 1987, standardization => IEEE 1076-1987.
• An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

• IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
• Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.
• The IEEE 1164 standard was developed by the IEEE.
• IEEE 1164 contains definitions for a nine-valued data type, std_logic.

• IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.
• Defines, e.g., two numeric types: signed and unsigned.

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

• A black box with an interface definition
• Defines the inputs/outputs of a component (defines pins)
• A way to represent modularity in VHDL
• Similar to a symbol in a schematic
• An entity declaration describes the entity

E.g.

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment.

• Each port must have a name, a direction and a type
• An entity may have NO port declaration

Port directions:

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures: Every entity has at least one architecture.

• One entity can have several architectures
• Architectures can describe a design using: Behavior – Structure – Dataflow
• Architectures can describe a design on many levels: Gate level – RTL (Register Transfer Level) – Behavioral level
• A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and architecture body together.

• The concept of a default configuration is a bit messy in VHDL '87
  – The last architecture analyzed links to the entity
• Can be used to change simulation behavior without re-analyzing the VHDL source
• Complex configuration declarations are ignored in synthesis
• Some entities can have, e.g., a gate level architecture and a behavioral architecture
• Are always optional

Packages: Packages contain information common to many design units.

1 Package declaration
  – constant declarations
  – type and subtype declarations
  – function and procedure declarations
  – global signal declarations
  – file declarations
  – component declarations

2 Package body
  – is not necessarily needed
  – function bodies
  – procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

Package declaration can contain:
  – type and subtype declarations
  – subprograms
  – constants
  – alias declarations
  – global signal declarations
  – file declarations
  – component declarations

Package body consists of:
  – subprogram declarations and bodies
  – type and subtype declarations
  – deferred constants
  – file declarations

Libraries: Collection of VHDL design units (database).

1 Packages
  – package declaration
  – package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)

Usually a directory in a UNIX file system. Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW. It is possible to describe a circuit in a number of ways:

Structural -------
Dataflow   ------- Higher level of abstraction
Behavioral -------

Structural VHDL description

• Circuit is described in terms of its components
• From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
• For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between the registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

– either a lower-level description
– or a behavioral description of the sequential elements is needed

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
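The points above can be sketched as follows: the combinational part is a concurrent signal assignment, and because the language has no built-in register, the flip-flop is written as a small clocked process. The entity and signal names are assumptions made for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dataflow_example is
  port (
    clk  : in  std_logic;
    a, b : in  std_logic;
    sel  : in  std_logic;
    q    : out std_logic
  );
end entity dataflow_example;

architecture dataflow of dataflow_example is
  signal d : std_logic;  -- combinational value feeding the register
begin
  -- Combinational logic as a concurrent (dataflow) assignment: a 2:1 multiplexer
  d <= a when sel = '1' else b;

  -- No built-in register exists, so the flip-flop is described behaviorally
  reg : process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process reg;
end architecture dataflow;
```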

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of the sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
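A small simulation-only sketch of an explicit delay is given below. It uses the same form as the "A <= B after 10 ns" example; the entity name and the stimulus process are assumptions added for illustration, and a synthesis tool would ignore the delay.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Testbench-style entity: no ports, intended for simulation only
entity delay_example is
end entity delay_example;

architecture behavioral of delay_example is
  signal a, b : std_logic := '0';
begin
  -- Explicit timing: a follows b 10 ns later
  a <= b after 10 ns;

  stimulus : process
  begin
    wait for 20 ns;
    b <= '1';                      -- b changes at 20 ns
    wait for 5 ns;                 -- now at 25 ns
    assert a = '0' report "a changed too early" severity error;
    wait for 10 ns;                -- now at 35 ns, a changed at 30 ns
    assert a = '1' report "a did not follow b" severity error;
    wait;                          -- suspend forever
  end process stimulus;
end architecture behavioral;
```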

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

A process is quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
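The two permitted forms can be compared side by side. In the sketch below (entity and label names are hypothetical), both processes describe the same AND gate: one is woken by its sensitivity list, the other by an explicit wait statement at the end of its body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_forms is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity process_forms;

architecture behavioral of process_forms is
begin
  -- Form 1: sensitivity list, no wait statement
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait at the end of the body
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

Each process is itself a concurrent statement, so the two run in parallel, while the statements inside each body execute in order.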


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

-- One-bit-per-pixel frame memory
component memory
    port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

-- Generates the sync pulses, enables and counters
component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end component;

-- Reads the memory and drives the colour outputs
component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide clk50m by two to obtain the pixel clock clk
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;
begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter: counts pixel clocks within a line, 0 to 793
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s = 793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances on each rising edge of h_en_s, 0 to 515
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse: low while the horizontal count is 0 to 95
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <= 95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >= 143) and (c_hs_s <= 778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s >= 35 and line_count_s <= 515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s < 2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- v_en delayed by one clock, used by pixelgen to detect the start of a frame
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
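As a rough check on the counter limits used in the synchronization block, the line and frame rates can be worked out from the counts. This assumes that the divided clock is 25 MHz, that is, that clk50m really is 50 MHz as its name suggests; the symbols below are introduced here for illustration only.

```latex
\begin{align*}
N_h &= 793 + 1 = 794 \ \text{clocks per line} \\
N_v &= 515 + 1 = 516 \ \text{lines per frame} \\
f_{\text{line}} &= \frac{f_{\text{clk}}}{N_h} = \frac{25\ \text{MHz}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{clk}}}{N_h N_v} = \frac{25\,000\,000}{794 \times 516} = \frac{25\,000\,000}{409\,704} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s}
\end{align*}
```

For comparison, the commonly quoted 640 x 480 timing uses 800 clocks per line and 525 lines per frame, so under the stated assumption these counters run slightly faster than that.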


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
signal addr_s : std_logic_vector(17 downto 0);
begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- rising edge of v_en: restart the read address at the top of the frame
        if((v_en = '1') and (v_en_reg = '0'))then
            addr_s <= (others => '0');
        end if;
        if(v_en = '1' and h_en = '1')then
            if(line_count <= 514)then
                if(c_hs <= 655)then
                    -- inside the image area: show the memory bit on all three colours
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- right of the image area: green
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                -- below the image area: white
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            -- wrap the address after the last memory location used
            if(addr_s = "111011111111111111")then
                addr_s <= (others => '0');
            end if;
        else
            -- outside the enabled region: black
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
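The address arithmetic in the pixel generator can be checked by hand. The sketch below only restates constants that appear in the code; reading the wrap value as a 640 x 384 image is an interpretation, not something the source states.

```latex
\begin{align*}
2^{18} &= 262\,144 \quad \text{(locations addressable by the 18-bit address)} \\
640 \times 480 &= 307\,200 > 2^{18} \quad \text{(a full one-bit frame would need 19 address bits)} \\
111011111111111111_2 &= 2^{18} - 1 - 2^{14} = 245\,759 \\
245\,759 + 1 &= 245\,760 = 640 \times 384
\end{align*}
```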


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
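As an illustration of how one logic block can realize an arbitrary small combinational function, a lookup table can be modeled as a constant truth table indexed by its inputs. The entity name lut3 and the generic INIT are hypothetical names chosen for this sketch; it is not the declaration of any particular vendor primitive.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lut3 is
  generic (
    -- Truth table, bit i is the output for input value i.
    -- The default "10010110" implements a 3-input XOR.
    INIT : std_logic_vector(7 downto 0) := "10010110"
  );
  port (
    addr : in  std_logic_vector(2 downto 0);
    y    : out std_logic
  );
end entity lut3;

architecture rtl of lut3 is
begin
  -- The inputs select one bit of the stored truth table
  y <= INIT(to_integer(unsigned(addr)));
end architecture rtl;
```

Changing only the INIT value turns the same structure into a different gate, for example "10000000" for a 3-input AND, which is the sense in which a logic block is "configured" rather than rewired.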

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs, but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have been historically used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attack of cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                              Architecture

                              The most common FPGA architecture consists of an array of configurable
                              logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
                              channels have the same width (number of wires). Multiple I/O pads may fit
                              into the height of one row or the width of one column in the array.

                              An application circuit must be mapped into an FPGA with adequate resources.
                              While the number of CLBs and I/Os required is easily determined from the
                              design, the number of routing tracks needed may vary considerably even
                              among designs with the same amount of logic. (For example, a crossbar switch
                              requires much more routing than a systolic array with the same gate count.)
                              Since unused routing tracks increase the cost (and decrease the performance)
                              of the part without providing any benefit, FPGA manufacturers try to provide
                              just enough tracks so that most designs that will fit in terms of LUTs and I/Os
                              can be routed. This is determined by estimates such as those derived from
                              Rent's rule or by experiments with existing designs.

                              The example FPGA considered here is an array, or island-style, FPGA. It
                              consists of an array of logic blocks and routing channels. Two I/O pads fit
                              into the height of one row or the width of one column, as shown below. All
                              the routing channels have the same width (number of wires).

                              12.1 FPGA structure

                              A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                              flip-flop, as shown below. In recent years, manufacturers have started moving
                              to 6-input LUTs in their high-performance parts, claiming increased
                              performance.

                              Typical logic block


                              There is only one output, which can be either the registered or the unregistered
                              LUT output. The logic block has four inputs for the LUT and a clock input.
                              Since clock signals (and often other high-fanout signals) are normally routed
                              via special-purpose dedicated routing networks in commercial FPGAs, they
                              are managed separately from other signals.
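The logic block described above can be modelled in a few lines of software. The following is only a behavioural sketch under stated assumptions: the class name `LogicBlock`, the 16-bit `truth_table` encoding and the input ordering (input `a` as the least significant index bit) are illustrative choices, not something taken from a vendor datasheet.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by a D flip-flop (illustrative only)."""

    def __init__(self, truth_table, registered=False):
        # A 4-input LUT stores one output bit for each of the 16 input combinations.
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table
        self.registered = registered  # selects registered or unregistered output
        self.ff = 0                   # flip-flop state, changed only by clock()

    def lut(self, a, b, c, d):
        # Assumed ordering: 'a' is the least significant bit of the table index.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model one rising clock edge: capture the LUT output in the flip-flop."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # The single block output is either the flip-flop or the raw LUT value.
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND gate: only index 15 (all inputs high) is set.
and4 = LogicBlock(truth_table=1 << 15)
reg_and4 = LogicBlock(truth_table=1 << 15, registered=True)
```

With `registered=True` the output only changes after `clock()` is called, which mirrors the registered/unregistered choice mentioned in the text.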

                              For this example architecture, the locations of the FPGA logic block pins are
                              shown below.

                              Logic Block Pin Locations

                              Each input is accessible from one side of the logic block, while the output pin
                              can connect to routing wires in both the channel to the right and the channel
                              below the logic block.

                              Each logic block output pin can connect to any of the wiring segments in the
                              channels adjacent to it.

                              Similarly, an I/O pad can connect to any one of the wiring segments in the
                              channel adjacent to it. For example, an I/O pad at the top of the chip can
                              connect to any of the W wires (where W is the channel width) in the horizontal
                              channel immediately below it.

                              Generally, the FPGA routing is unsegmented. That is, each wiring segment
                              spans only one logic block before it terminates in a switch box. By turning on
                              some of the programmable switches within a switch box, longer paths can be
                              constructed. For higher-speed interconnect, some FPGA architectures use
                              longer routing lines that span multiple logic blocks.

                              Whenever a vertical and a horizontal channel intersect, there is a switch box. In
                              this architecture, when a wire enters a switch box, there are three
                              programmable switches that allow it to connect to three other wires in adjacent
                              channel segments. The pattern, or topology, of switches used in this
                              architecture is the planar or domain-based switch box topology. In this switch
                              box topology, a wire in track number 1 connects only to wires in track
                              number 1 in adjacent channel segments, wires in track number 2 connect
                              only to other wires in track number 2, and so on. The figure below illustrates
                              the connections in a switch box.

                              Switch box topology
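The planar (domain-based) rule can be written down directly: a wire entering on one side may reach only the same-numbered track on the other three sides. The sketch below is a minimal illustration; the side names and the function `planar_switch_targets` are hypothetical and chosen only for this example.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_switch_targets(side, track, channel_width):
    """Return the (side, track) wires reachable from one wire in a planar switch box.

    Tracks are numbered from 1 to channel_width, as in the text.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side!r}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number is outside the channel width")
    # Three programmable switches: one to the same track on each other side.
    return [(other, track) for other in SIDES if other != side]


targets = planar_switch_targets("left", 2, channel_width=4)
```

Because the track number never changes inside a switch box, a signal that starts on track 2 stays on track 2 along its whole route, which is the defining property of this topology.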

                              Modern FPGA families expand upon the above capabilities to include higher-
                              level functionality fixed into the silicon. Having these common functions
                              embedded into the silicon reduces the area required and gives those functions
                              increased speed compared to building them from primitives. Examples of these
                              include multipliers, generic DSP blocks, embedded processors, high-speed I/O
                              logic, and embedded memories.


                              FPGAs are also widely used for systems validation, including pre-silicon
                              validation, post-silicon validation, and firmware development. This allows chip
                              companies to validate their design before the chip is produced in the factory,
                              reducing the time to market.

                              12.4 FPGA Design and Programming

                              To define the behavior of the FPGA, the user provides a hardware description
                              language (HDL) or a schematic design. The HDL form might be easier to work
                              with when handling large structures, because it's possible to just specify them
                              numerically rather than having to draw every piece by hand. On the other hand,
                              schematic entry can allow for easier visualisation of a design.

                              Then, using an electronic design automation tool, a technology-mapped netlist
                              is generated. The netlist can then be fitted to the actual FPGA architecture
                              using a process called place-and-route, usually performed by the FPGA
                              company's proprietary place-and-route software. The user will validate the
                              map, place, and route results via timing analysis, simulation, and other
                              verification methodologies. Once the design and validation process is
                              complete, the binary file generated (also using the FPGA company's
                              proprietary software) is used to (re)configure the FPGA.

                              Going from schematic/HDL source files to actual configuration: the source
                              files are fed to a software suite from the FPGA/CPLD vendor that, through
                              different steps, will produce a file. This file is then transferred to the
                              FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                              like an EEPROM.

                              The most common HDLs are VHDL and Verilog, although, in an attempt to
                              reduce the complexity of designing in HDLs, which have been compared to the
                              equivalent of assembly languages, there are moves to raise the abstraction level
                              through the introduction of alternative languages.


                              To simplify the design of complex systems in FPGAs, there exist libraries of
                              predefined complex functions and circuits that have been tested and optimized
                              to speed up the design process. These predefined circuits are commonly called
                              IP cores and are available from FPGA vendors and third-party IP suppliers
                              (rarely free, and typically released under proprietary licenses). Other
                              predefined circuits are available from developer communities such as
                              OpenCores (typically free and released under the GPL, BSD, or a similar
                              license) and other sources.

                              In a typical design flow, an FPGA application developer will simulate the
                              design at multiple stages throughout the design process. Initially, the RTL
                              description in VHDL or Verilog is simulated by creating test benches to
                              simulate the system and observe results. Then, after the synthesis engine has
                              mapped the design to a netlist, the netlist is translated to a gate-level
                              description, where simulation is repeated to confirm the synthesis proceeded
                              without errors. Finally, the design is laid out in the FPGA, at which point
                              propagation delays can be added and the simulation run again with these
                              values back-annotated onto the netlist.

                              Basic Process Technology Types

                              SRAM - Based on static memory technology. In-system programmable and
                              re-programmable. Requires external boot devices. CMOS.

                              Antifuse - One-time programmable. CMOS.

                              EPROM - Erasable Programmable Read-Only Memory technology. Usually
                              one-time programmable in production because of plastic packaging.
                              Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                              EEPROM - Electrically Erasable Programmable Read-Only Memory
                              technology. Can be erased even in plastic packages. Some, but not all,
                              EEPROM devices can be in-system programmed. CMOS.

                              Flash - Flash-erase EPROM technology. Can be erased even in plastic
                              packages. Some, but not all, flash devices can be in-system programmed.
                              Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                              therefore less expensive to manufacture. CMOS.

                              Fuse - One-time programmable. Bipolar.

                              Major Manufacturers

                              Xilinx and Altera are the current FPGA market leaders and long-time industry
                              rivals. Together, they control over 80 percent of the market, with Xilinx alone
                              representing over 50 percent.

                              Xilinx also provides free Windows and Linux design software, while Altera
                              provides free Windows tools; the Solaris and Linux tools are only available via
                              a rental scheme.

                              Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                              (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                              Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-
                              focused CSSP, no general-purpose FPGAs).

                              12.5 FPGA prototype

                              FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                              prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                              for hardware verification and early software development.

                              Verification of hardware designs alongside early software and firmware
                              co-design has become mainstream. Prototyping SoC and ASIC designs on an
                              FPGA has become a good method to do this.


                              Reasons why Prototyping is important

                              1. Running an SoC design on an FPGA prototype is a reliable way to ensure
                              that it is functionally correct, compared with relying only on software
                              simulations to verify that the hardware design is sound. Simulation speed and
                              modeling accuracy limitations hinder this development.

                              2. Due to time constraints, many projects cannot wait until the silicon is back
                              from the foundry to start on software tests. FPGA prototyping allows much
                              more time for software development and testing at the software-hardware
                              integration stage. This allows many unforeseen software bugs, which appear
                              due to today's array of operating systems, applications, and hardware, to be
                              uncovered early.

                              3. Prototyping also allows the developer to ensure that all IP technologies on
                              the system work well together, off the simulation stage and in actual form.

                              4. Prototyping has the added advantage of serving as a demo platform for SoC
                              clients, bringing in interest early. This speeds up the overall development
                              cycle and allows for more enhancement or improvement to the chip features
                              than would otherwise have been possible.


                              13 SIMULATION RESULTS

                              Fig. Gate level: VGA Controller

                              Fig. Gate level: Sync Unit

                              Fig. Technology schematic

                              14 APPLICATIONS

                              Motion Tracking Systems

                              Image processing systems

                              Displaying Systems

                              14.1 CONCLUSION

                              Hence, with the VGA Controller implemented on an FPGA, a processor-
                              generated picture (pixelated) is taken from memory space, and the Controller
                              provides digital RGB values for each pixel, as well as horizontal and vertical
                              synchronization signals, in order to correctly display the picture on a
                              connected monitor.


                              REFERENCES

                              1. http://en.wikipedia.org/wiki/Video_Graphics_Array

                              2. Eduardo Sanchez, "A VGA display controller"

                              3. Hinner, "VGA Interface and video signal documents"


                                G (Green)

                                B (Blue)

                                HSYNC (Horizontal Synchronous signal)

                                VSYNC (Vertical Synchronous signal)

                                All the above signals are used to control the image data so that it is
                                correctly displayed on the VGA display monitor.

                                7.2 Timings for front porch, sync pulse, and back porch

                                               Horizontal sync    Vertical sync
                                Front porch    0.94 µs            0.35 ms
                                Sync pulse     3.77 µs            0.063 ms
                                Back porch     1.89 µs            1.02 ms


                                8 FUNCTIONAL DESCRIPTION

                                8.1 Symbol

                                VGA Controller Symbol

                                Pin description

                                The pin description is shown in the table below.

                                Table VGA Pin description

                                Name | Type | Polarity | Bus size | Description

                                Global Control Signals

                                CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz. 800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz.

                                RST | I | High | - | Global system reset.

                                VGA Input Settings Signals

                                RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used: 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

                                CMOD | I | - | 2 | This input selects the color palette to be used: 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

                                DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

                                DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

                                Data Memory Control Signals

                                RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

                                DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: in 64 Colors mode, bits 7 and 6 of each byte are not used.

                                ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive - once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                                VGA Monitor Control Signals

                                HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                                VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                                R1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                G1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                B1 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                9 VGA SIGNAL TIMING

                                Figure 3 summarizes the signal timing involved in sending a line of
                                pixels and a frame of lines. The actual time values differ according to the
                                resolution selected (640x480 or 800x600), the processor-defined values for
                                DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal
                                (CLK).
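The dependence of the refresh rate on the clock frequency can be made concrete: one frame takes (clocks per line) x (lines per frame) clock cycles, so the refresh rate is CLK divided by that product. The sketch below assumes totals of 800 clocks per line and 525 lines per frame for 640x480; these are commonly quoted figures and are an assumption here, not values stated in this report.

```python
def refresh_rate_hz(clk_hz, clocks_per_line, lines_per_frame):
    """Frames per second for a given pixel clock and total line/frame periods."""
    return clk_hz / (clocks_per_line * lines_per_frame)


# Assumed totals (visible area plus blanking) for 640x480; not from this report.
ASSUMED_CLOCKS_PER_LINE = 800
ASSUMED_LINES_PER_FRAME = 525

rate = refresh_rate_hz(25_000_000, ASSUMED_CLOCKS_PER_LINE, ASSUMED_LINES_PER_FRAME)
print(f"{rate:.2f} Hz")
```

Under these assumptions a 25 MHz clock gives a refresh rate of roughly 60 Hz, in line with the CLK entry of the pin description table.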


                                9.1 VGA PIXEL UNIT

                                The Pixel Unit provides access to the pixelated image stored in external
                                picture memory, reading in data a byte at a time and formatting each pixel to
                                be displayed. For each pixel, the 6-bit RGB value required for the control of
                                the monitor's three primary color electron guns is generated, so that the pixel
                                is displayed on the screen with the correct color.

                                Accessing the stored image

                                The image to be displayed on the monitor screen is written by the host
                                microcontroller into external memory space (RAM). This memory space can
                                be located anywhere (e.g. using a block of RAM within the FPGA design, or
                                using the SRAM on the NanoBoard itself).

                                Pixel data is stored in the memory space in bytes. The number of pixels in a
                                byte depends on which color palette is being used for the image (selected by
                                the CMOD input):

                                • in Black & White mode - one byte of data in RAM contains 8 1-bit pixels
                                • in 16 Colors mode - one byte of data in RAM contains 2 4-bit pixels
                                • in 64 Colors mode - one byte of data in RAM contains 1 6-bit pixel
                                  (DATA[7..6] are not used)


                                The size of memory required to store a picture is determined by the total
                                number of viewable pixels in a line (determined by DISPSIZE_H), the total
                                number of viewable lines in a frame (determined by DISPSIZE_V), and the
                                number of pixels stored in each byte in memory space:

                                Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
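As a worked example of the memory formula, the sketch below evaluates it for each color palette. The function name `picture_memory_bytes` and the mode keys `"bw"`, `"16"` and `"64"` are illustrative names introduced here; the pixels-per-byte values come from the list above.

```python
# Pixels packed into one byte for each color palette (from the text above).
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store pix_total x lin_total pixels in the given palette."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Round up so that a partially filled final byte is still counted.
    return -(-pixels // per_byte)


for mode in ("bw", "16", "64"):
    print(mode, picture_memory_bytes(640, 480, mode))
```

For a full 640x480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.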

                                The address in RAM where the next pixel is stored is determined using
                                an internal signal provided by the Synchronization Unit - hvcnt - which
                                reflects the current contents of the MEMADDR register. The exact addressing
                                is described below.

                                9.2 Black & White mode

                                The picture memory address - the byte of data containing the next 8 pixels
                                - is determined by using bits 18..3 of hvcnt and right shifting the contents
                                by three:

                                ADDR_PIXEL = "00" & hvcnt[18..3]

                                16 Colors mode

                                The picture memory address - the byte of data containing the next 2 pixels
                                - is determined by using bits 18..1 of hvcnt and right shifting the contents
                                by one:

                                ADDR_PIXEL = "0" & hvcnt[18..1]

                                64 Colors mode

                                The picture memory address - the byte of data containing the next pixel - is
                                determined by using the full value of hvcnt:

                                ADDR_PIXEL = hvcnt[18..0]
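The three addressing rules amount to dividing the pixel counter by the number of pixels per byte, which is a right shift by 3, 1 or 0 bits. The following sketch models this in software; the function name `addr_pixel` and the mode keys are illustrative, and the 19-bit width of hvcnt is taken from the bit ranges quoted above.

```python
HVCNT_BITS = 19  # hvcnt[18..0]

# Right-shift amount per palette: 8, 2 or 1 pixels per byte.
SHIFT_BY_MODE = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory that holds pixel number hvcnt."""
    hvcnt &= (1 << HVCNT_BITS) - 1        # keep only the 19 counter bits
    return hvcnt >> SHIFT_BY_MODE[mode]   # vacated upper bits read as zero


# Pixel 17 lives in byte 2 (B&W), byte 8 (16 Colors) or byte 17 (64 Colors).
examples = [addr_pixel(17, m) for m in ("bw", "16", "64")]
```

The zero padding written as a concatenation in the expressions above corresponds to the zeros that a logical right shift brings in at the top of the address.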

                                The Pixel register (PIXREG)

                                The Pixel register is used to receive the byte of pixel data read from the
                                current address in memory space. The register PIXREG is reset to zero when
                                the VGA Controller receives an external reset signal (RST).


                                The register is updated on each rising edge of the CLK signal. Data can be
                                read from the memory space as long as the RD signal is active (High). The RD
                                signal is itself controlled by the external line display enable signal, enex.
                                This internally generated signal is defined as follows:

                                If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
                                   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
                                    enex = 0
                                Else
                                    enex = 1

                                When enex is Low, read access from memory is disabled (RD = 0).
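The enex condition translates almost literally into code. The sketch below is a software model only; the parameter names mirror the quantities in the condition, and the numeric values used in the example (640, 480, 800, 525) are assumed for illustration rather than taken from this report.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 0 while the counters are in a blanking interval, otherwise 1."""
    h_blank = (hcount > pix_total + 1) and (hcount < line_period)
    v_blank = (vcount > lin_total + 1) and (vcount < frame_period)
    return 0 if (h_blank or v_blank) else 1


# Assumed example values: 640x480 visible area, 800-clock lines, 525-line frames.
ARGS = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)

visible = enex(100, 100, **ARGS)       # inside the picture
h_blanking = enex(700, 100, **ARGS)    # past the end of the visible line
v_blanking = enex(100, 500, **ARGS)    # past the last visible line
```

In this model RD would simply follow the returned value, so memory reads stop whenever either counter is in its blanking interval.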

                                The point at which data is loaded from memory into PIXREG depends on the
                                particular color palette that is chosen - Black & White, 16 Colors, or 64
                                Colors.

                                Black & White mode

                                The next byte of data will be loaded into the register whenever the
                                lowest three bits of the hvcnt signal, received from the Synchronization
                                Unit, are "000".

                                For the currently loaded byte, the active pixel is always in the lowest bit
                                position of the Pixel register. Each pixel in the data byte is moved into this
                                active pixel position by shifting the contents of the register right by one bit
                                on each rising edge of CLK.

                                16 Colors mode

                                The next byte of data will be loaded into the register whenever the
                                lowest bit of the hvcnt signal, received from the Synchronization Unit, is 0.

                                For the currently loaded byte, the active pixel is always in the low-order
                                nibble of the Pixel register. Remember that in this mode each byte of data
                                contains two pixels. The second pixel is moved into this active pixel position
                                by shifting the contents of the register right by four bits on the rising edge
                                of CLK.

                                64 Colors mode

                                The next byte of data will be loaded into the register on the rising edge
                                of the external system clock signal (CLK). In this mode, the read of pixel data
                                does not depend on the status of the hvcnt signal received from the
                                Synchronization Unit.
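The load-and-shift behaviour of PIXREG determines the order in which the pixels of a byte reach the active position: lowest bit first in Black & White mode, low nibble first in 16 Colors mode. The sketch below lists the pixels of one byte in that order; the function name `pixels_from_byte` and the mode keys are illustrative, and the clock-by-clock timing is deliberately left out.

```python
def pixels_from_byte(byte, mode):
    """Pixels of one memory byte, in the order they reach the active position."""
    byte &= 0xFF
    if mode == "bw":
        # One right shift per clock: bit 0 is displayed first, bit 7 last.
        return [(byte >> i) & 0x1 for i in range(8)]
    if mode == "16":
        # Low-order nibble first, then the high nibble after a 4-bit shift.
        return [byte & 0xF, (byte >> 4) & 0xF]
    if mode == "64":
        # A single 6-bit pixel; bits 7 and 6 are not used.
        return [byte & 0x3F]
    raise ValueError(f"unknown mode: {mode!r}")


bw_pixels = pixels_from_byte(0b10110001, "bw")
nibbles = pixels_from_byte(0xB1, "16")
```

Modelling the register as a list keeps the example short; a closer hardware model would keep one 8-bit value and shift it on every simulated clock edge.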

                                9.3 The RGB register (RGBREG)

                                The RGB register is used to store the six bits that are required for
                                driving the red, green, and blue color guns of the target monitor. When the
                                chosen color palette is either Black & White or 16 Colors, these six bits are
                                obtained by mapping the value of the active pixel to a predefined RGB code.
                                When the chosen color palette is 64 Colors, the actual pixel value is used
                                directly.

                                The register RGBREG is reset to zero (000000) when the VGA
                                Controller receives an external reset signal (RST). This RGB code represents
                                black.

                                The register is updated on each rising edge of the CLK signal, and the
                                value loaded is dependent on the state of the line display enable signal, en.
                                When en is Low, blanking is required and RGBREG is loaded with the code
                                for black (000000).

                                Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
                                and a 4-bit pixel (16 Colors mode) into the required RGB color code.


                                The RGB color code stored in the RGB register is output from the VGA
                                Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
                                and B1).

                                The monitor itself expects analog signals as inputs to its electron gun control
                                circuits. This is achieved by using 2-bit digital-to-analog converter circuitry,
                                located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion


For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each of the three analog inputs being one of four possible levels, the
monitor can display each pixel on the screen with one of 4 x 4 x 4 = 64
different color permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3
Generation FPGAs provide plentiful, efficient SelectRAM memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers and shift
registers, each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator system or
via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which is allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical: both ports are
interchangeable and both support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs
to create deeper and wider memory organizations, with a minimal timing
penalty incurred through specialized routing resources.
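As an illustration of the synchronous single-port behavior described above (clock, clock enable, write enable, registered read), a generic VHDL memory can be sketched as follows. The entity name sp_ram_sketch, its generics and its port names are hypothetical. Synthesis tools commonly map a template of this shape onto block RAM, but whether a particular tool does so is an assumption to verify in its documentation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Sketch of a single-port synchronous RAM (one aspect ratio: 2048 x 8).
entity sp_ram_sketch is
  generic (
    ADDR_WIDTH : positive := 11;   -- 2**11 = 2048 words
    DATA_WIDTH : positive := 8     -- 2048 x 8 = 16K data bits
  );
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                                   -- clock enable
    we   : in  std_logic;                                   -- write enable
    addr : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity sp_ram_sketch;

architecture rtl of sp_ram_sketch is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;   -- synchronous write
        end if;
        -- Synchronous read; during a write this returns the old contents,
        -- because the signal ram is only updated after the process suspends.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Changing the two generics selects a different aspect ratio, for example ADDR_WIDTH = 14 and DATA_WIDTH = 1 for a 16K x 1 organization.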


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                Interoperability

                                Technology independence

                                Design reuse

                                Several levels of abstraction

                                Readability

                                Standard language

                                Widely supported

What is VHDL?
VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                                Design specification language

                                Design entry language

                                Design simulation language

                                Design documentation language

                                An alternative to schematics

10.1 Brief History
VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
In 1987: standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE Standard 1076-1993

10.2 Related Standards
IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard
The IEEE 1164 standard was developed by the IEEE
IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
Defines, e.g., two numeric types: signed and unsigned
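A short sketch of how these standards appear in practice: std_logic_1164 supplies std_logic (including the high-impedance value 'Z'), and numeric_std supplies the unsigned type used here for arithmetic. The entity add_sketch and its ports are hypothetical names chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;   -- std_logic: 'U','X','0','1','Z','W','L','H','-'
use ieee.numeric_std.all;      -- signed and unsigned

-- Sketch: 8-bit unsigned adder with a tri-stated 9-bit result.
entity add_sketch is
  port (
    a, b : in  std_logic_vector(7 downto 0);
    oe   : in  std_logic;                      -- output enable
    sum  : out std_logic_vector(8 downto 0)
  );
end entity add_sketch;

architecture rtl of add_sketch is
  signal total : unsigned(8 downto 0);
begin
  -- Interpret the operands as unsigned and widen to 9 bits so the carry is kept.
  total <= resize(unsigned(a), 9) + resize(unsigned(b), 9);

  -- Drive the result when enabled, otherwise high-impedance ('Z').
  sum <= std_logic_vector(total) when oe = '1' else (others => 'Z');
end architecture rtl;
```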

                                VHDL Environment

                                Design Units


                                Segments of VHDL code that can be compiled separately and stored in a

                                library

                                Entities

                                A black box with interface definition

Defines the inputs/outputs of a component (defines pins)

                                A way to represent modularity in VHDL

                                Similar to symbol in schematic

                                Entity declaration describes entity

E.g.
entity Comparator is
  port (A, B : in std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have NO port declaration
Port directions
In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures: Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
Configuration declaration links architecture to entity

E.g.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link entity declaration and architecture body together
The concept of default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Configurations are always optional

Packages: Packages contain information common to many design units
1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations
2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
A package consists of a declaration part and an optional body part
Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations
Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries: Collection of VHDL design units (database)
1. Packages
- package declaration
- package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in a UNIX file system
Can also be any other kind of database

                                Levels of Abstraction

                                VHDL supports many possible styles of design description which differ

                                primarily in how closely they relate to the HW

                                It is possible to describe a circuit in a number of ways

                                Structural-------

                                Dataflow ------- Higher level of abstraction

                                Behavioral -------

                                Structural VHDL description

                                Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level
description (e.g. block diagram)

                                For large circuits a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description
Circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- Either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no 3rd-party models for registers => you must write the behavioral description of registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
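Since the language has no built-in register, one has to be described. A minimal sketch of such a behavioral register description is given here, written as a clocked process rather than as a subprogram (a substitution made for brevity). The entity name reg8_sketch, the 8-bit width and the active-low asynchronous reset are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Sketch: behavioral description of an 8-bit register.
entity reg8_sketch is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                      -- active-low asynchronous reset
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8_sketch;

architecture behavioral of reg8_sketch is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');        -- clear the register on reset
    elsif rising_edge(clk) then
      q <= d;                      -- capture the input on each rising clock edge
    end if;
  end process;
end architecture behavioral;
```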

10.4 Behavioral VHDL Description
Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis

Concurrent vs. Sequential
Processes
Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
Process keyword in VHDL
A process statement is a concurrent statement
Statements inside process statements are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up
General format:
process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
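The two permitted forms can be placed side by side. In this sketch (entity and_sketch and its ports are hypothetical names), both processes describe the same AND gate: the first is woken by its sensitivity list, the second by an explicit wait statement at the end of its body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_sketch is
  port (
    a, b   : in  std_logic;
    y1, y2 : out std_logic
  );
end entity and_sketch;

architecture behavioral of and_sketch is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement.
  -- The process runs once at start-up, then suspends until a or b changes.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

Both are concurrent statements in the architecture, while the assignments inside each process execute sequentially.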


                                11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in std_logic;
  rstn   : in std_logic;
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
  port (
    addr  : IN std_logic_VECTOR(17 downto 0);
    clk   : IN std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN std_logic;
    sinit : IN std_logic);
END component;

component sync_unit port(
  clk        : in std_logic;
  rstn       : in std_logic;
  hsync      : out std_logic;   ---horizontal synch pulse(row)
  vsync      : out std_logic;   --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : inout integer;
  c_hs       : inout integer);
end component;

component pixelgen port(
  clk        : in std_logic;
  rstn       : in std_logic;
  hsync      : in std_logic;
  vsync      : in std_logic;
  h_en       : in std_logic;
  v_en       : in std_logic;
  v_en_reg   : in std_logic;
  data_in    : in std_logic_vector(0 downto 0);
  line_count : in integer;
  c_hs       : in integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the input clock by two to obtain the pixel clock clk.
process(clk50m, rstn)
begin
  if (rstn = '0') then
    clk <= '0';
  elsif (clk50m = '1' and clk50m'event) then
    clk <= not clk;
  end if;
  --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  v_en_reg   => v_en_reg_s,
  line_count => line_count_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  c_hs       => c_hs_s
);

pixel : pixelgen port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  v_en_reg   => v_en_reg_s,
  data_in    => data_s,
  line_count => line_count_s,
  c_hs       => c_hs_s,
  red        => red,
  green      => green,
  blue       => blue,
  addr       => addr_s,
  rd         => rd_s
);

mem12 : memory port map(
  addr  => addr_s,
  clk   => clk,
  dout  => data_s,
  en    => rd_s,
  sinit => rstn
);

end Behavioral;


                                SYNCHRONIZATION BLOCK CODE

                                library IEEE

                                use IEEESTD_LOGIC_1164ALL

                                use IEEESTD_LOGIC_ARITHALL

                                use IEEESTD_LOGIC_UNSIGNEDALL

                                entity sync_unit is port(

                                clkin std_logic

                                rstnin std_logic

                                hsyncout std_logic ---horizontal synch pulse(row)

                                vsyncout std_logic --vertical sync

                                pulse(frame) 111110100000000000

                                v_en_regout std_logic

                                h_enout std_logic

                                v_enout std_logic

                                line_countout integer

                                c_hsout integer )

                                end sync_unit

                                architecture arch_sync_unit of sync_unit is

                                signal h_en_sv_en_sstd_logic

                                signal line_count_sc_hs_sinteger

                                begin

                                c_hs lt= c_hs_s

                                line_count lt= line_count_s

                                h_en lt= h_en_s

                                v_en lt= v_en_s

                                -- hsync counter

                                process(clkrstn)

                                begin

                                if(rstn = 0)then

                                46

                                c_hs_s lt= 0

                                else

                                if(clk = 1 and clkevent)then

                                if(c_hs_s=793)then

                                c_hs_s lt= 0

                                else

                                c_hs_s lt= c_hs_s +1

                                end if

                                end if

                                end if

                                end process

                                --vsync counter

                                --vertical line counter

                                process(h_en_srstn)

                                begin

                                if(rstn = 0)then

                                line_count_s lt= 0

                                elsif(h_en_s = 1 and h_en_sevent)then

                                line_count_s lt= line_count_s + 1

                                if (line_count_s=515)then

                                line_count_s lt= 0

                                end if

                                end if

                                --end if

                                end process

                                --hysnc pulse

                                process(clkrstn)

                                begin

                                if(rstn = 0)then

                                47

                                hsync lt= 1

                                elsif(clk = 1 and clkevent)then

                                if(c_hs_s lt=95)then

                                hsync lt= 0

                                else

                                hsync lt= 1

                                end if

                                end if

                                --end if

                                end process

                                process(clkrstn) ----Horizontal Enable ( Row Information Will be Sent in

                                this period)

                                begin

                                if(rstn = 0)then

                                h_en_s lt= 0

                                elsif(clk = 1 and clkevent)then

                                if((c_hs_s gt=143) and (c_hs_slt=778))then

                                h_en_s lt= 1

                                else

                                h_en_s lt= 0

                                end if

                                end if

                                --end if

                                end process

--- vsync pulse
-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

-- Vertical sync, front porch, back porch
process(clk, rstn)
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        -- active-low sync pulse during the first two lines of the frame
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- One-clock delayed copy of the vertical enable
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- rising edge of v_en: restart the read address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- inside the image: read memory and drive all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd     <= '0';
                        red    <= "0";
                        green  <= "1";
                        blue   <= "0";
                    end if;
                else
                    rd     <= '0';
                    red    <= "1";
                    green  <= "1";
                    blue   <= "1";
                end if;
                -- wrap to zero after address 245759
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red    <= "0";
                green  <= "0";
                blue   <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;
end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

121 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

123 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
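
The trade-off between number format and parallelism can be illustrated with a small sketch. It is not part of the VGA design; the entity name parallel_add and the generics LANES and WIDTH are hypothetical. Each lane is an independent adder, so narrowing WIDTH leaves room for roughly proportionally more lanes in the same logic, at the cost of range per operand.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: LANES independent adders working in the same clock cycle.
entity parallel_add is
    generic (
        LANES : positive := 8;   -- number of parallel arithmetic units (assumed value)
        WIDTH : positive := 8    -- bits per operand (assumed value)
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(LANES*WIDTH-1 downto 0);  -- LANES operands packed side by side
        b   : in  std_logic_vector(LANES*WIDTH-1 downto 0);
        sum : out std_logic_vector(LANES*WIDTH-1 downto 0)
    );
end entity parallel_add;

architecture rtl of parallel_add is
begin
    -- One registered adder per lane; every lane updates on the same clock edge.
    gen_lane : for i in 0 to LANES-1 generate
        process(clk)
        begin
            if rising_edge(clk) then
                -- The result wraps modulo 2**WIDTH: range is what is traded away.
                sum((i+1)*WIDTH-1 downto i*WIDTH) <=
                    std_logic_vector(unsigned(a((i+1)*WIDTH-1 downto i*WIDTH))
                                   + unsigned(b((i+1)*WIDTH-1 downto i*WIDTH)));
            end if;
        end process;
    end generate gen_lane;
end architecture rtl;
```

With the default generics the block performs eight 8-bit additions per clock; setting WIDTH to 4 and LANES to 16 keeps the port width unchanged while doubling the number of additions per clock.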

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware cost per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
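
The low-volume argument can be made concrete with a simple linear cost model. This is a sketch under stated assumptions, not a figure from any vendor: F denotes the one-time development (non-recurring engineering) cost, u the cost per unit, and n the number of units produced. It is assumed that the ASIC has the higher one-time cost and the FPGA the higher unit cost.

```latex
\begin{align*}
C_{\mathrm{FPGA}}(n) &= F_{\mathrm{FPGA}} + n\,u_{\mathrm{FPGA}}, &
C_{\mathrm{ASIC}}(n) &= F_{\mathrm{ASIC}} + n\,u_{\mathrm{ASIC}} \\[4pt]
C_{\mathrm{FPGA}}(n^{*}) &= C_{\mathrm{ASIC}}(n^{*})
\quad\Longrightarrow\quad &
n^{*} &= \frac{F_{\mathrm{ASIC}} - F_{\mathrm{FPGA}}}{u_{\mathrm{FPGA}} - u_{\mathrm{ASIC}}}
\end{align*}
```

Below the break-even volume n* the FPGA is the cheaper choice; above it the ASIC is. Rising ASIC development costs increase the numerator and falling FPGA unit costs reduce the denominator, and both changes move n* upward, which is one way to read the broadened range of viable applications.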

Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
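
For reference, Rent's rule is usually stated as an empirical power law between the number of gates in a block of logic and the number of external connections that block needs. The symbols below follow the conventional statement of the rule and are not taken from this report.

```latex
T = t \, g^{p}
```

Here T is the number of external terminals of the block, g is the number of gates (or logic blocks) inside it, t is the average number of terminals per gate, and p is the Rent exponent, an empirically fitted value that is below 1 for typical designs. A design with a larger p needs more wiring for the same amount of logic, which is why the crossbar switch in the example above demands more routing tracks than a systolic array of the same gate count.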

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
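
The logic block just described can be modelled behaviourally in a few lines of VHDL. This is a minimal sketch, not a vendor primitive: the entity name lut4_ff, the generic INIT (the 16-entry truth table), and the generic REGISTERED (the output selection) are hypothetical names chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT followed by a flip-flop.
entity lut4_ff is
    generic (
        INIT       : std_logic_vector(15 downto 0) := x"8000"; -- truth table; x"8000" is a 4-input AND
        REGISTERED : boolean := true                            -- true: flip-flop output, false: raw LUT output
    );
    port (
        clk : in  std_logic;
        i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
        o   : out std_logic                      -- the single block output
    );
end entity lut4_ff;

architecture rtl of lut4_ff is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT behaves as a 16 x 1 memory addressed by the four inputs.
    lut_out <= INIT(to_integer(unsigned(i)));

    -- D flip-flop capturing the LUT output on each rising clock edge.
    process(clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Output selection is fixed when the block is configured.
    o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the logic function without changing the structure: for example, x"6996" gives the parity (XOR) of the four inputs. In a real device the truth table and the output selection are held in configuration memory rather than in generics.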

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
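
The first stage of that flow, RTL simulation with a test bench, can be sketched as follows. The design under test here is a deliberately trivial two-input AND gate; the names and2 and and2_tb are hypothetical and are not part of the VGA controller. The point is the shape of a test bench: an entity with no ports that instantiates the design, drives its inputs, and checks its outputs with assertions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a two-input AND gate.
entity and2 is
    port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
    y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: it has no ports and exists only for simulation.
entity and2_tb is
end entity and2_tb;

architecture sim of and2_tb is
    signal a, b, y : std_logic := '0';
begin
    dut : entity work.and2 port map (a => a, b => b, y => y);

    stimulus : process
    begin
        a <= '0'; b <= '0'; wait for 10 ns;
        assert y = '0' report "0 and 0 should be 0" severity error;
        a <= '1'; b <= '0'; wait for 10 ns;
        assert y = '0' report "1 and 0 should be 0" severity error;
        a <= '1'; b <= '1'; wait for 10 ns;
        assert y = '1' report "1 and 1 should be 1" severity error;
        report "test bench finished" severity note;
        wait;  -- suspend forever so the simulation can end
    end process stimulus;
end architecture sim;
```

The same test bench can normally be reused at the later stages by binding it to the gate-level or back-annotated netlist in place of the RTL architecture, so that the checks stay identical while the model under test becomes more detailed.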

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

                                Xilinx also provides free Windows and Linux design software while Altera

                                provides free Windows tools the Solaris and Linux tools are only available via

                                a rental scheme

                                Other competitors include Lattice Semiconductor (flash SRAM) Actel

                                (antifuse flash-based mixed-signal) SiliconBlue Technologies (low power)

                                Achronix (RAM based 15GHz fabric speed) and QuickLogic (handheld

                                focused CSSP no general purpose FPGAs)

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs alongside early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers only relying on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA Controller

Fig. Gate level: Sync Unit

Fig. Technology schematic

                                14 APPLICATIONS

                                Motion Tracking Systems

                                Image processing systems

                                Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



8 FUNCTIONAL DESCRIPTION

8.1 Symbol

                                  VGA Controller Symbol

                                  Pin description

The pin description is shown in the table below.

Table: VGA pin description

Name | Type | Polarity | Bus size | Description

Global Control Signals

CLK | I | Rising | - | Global system clock. This clock determines the maximum rate at which pixels can be sent to the monitor. The frequency of the clock signal determines the refresh rate as follows. 640x480: CLK = 25 MHz gives Refresh = 60 Hz; CLK = 30 MHz gives Refresh = 72 Hz. 800x600: CLK = 40 MHz gives Refresh = 60 Hz; CLK = 50 MHz gives Refresh = 72 Hz.

RST | I | High | - | Global system reset.

VGA Input Settings Signals

RESOLUTION | I | High/Low | - | This input selects the screen resolution to be used. 1 = 640x480 (VGA); 0 = 800x600 (SVGA).

CMOD | I | - | 2 | This input selects the color palette to be used. 00 = Black & White; 01 = 16 Colors; 10 = 64 Colors.

DISPSIZE_H | I | - | 10 | This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

DISPSIZE_V | I | - | 10 | This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

Data Memory Control Signals

RD | O | High | - | This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

DATA | I | - | 8 | Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

ADDR_PIXEL | O | - | 19 | Specifies the address of the next pixel in picture memory. Addresses are consecutive – once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

VGA Monitor Control Signals

HSYNC | O | Falling | - | Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

VSYNC | O | Falling | - | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1, R0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1, G0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1, B0 | O | High/Low | - | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                  9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
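The dependence of the refresh rate on CLK can be illustrated with a small Python sketch. It assumes one pixel position per CLK cycle and, for 640x480, a line period of 800 pixel clocks and a frame period of 525 lines. These totals are the commonly used VGA timing figures and are an assumption here; they are not stated in this report, and the function and constant names are hypothetical.

```python
def refresh_rate_hz(clk_hz, line_period, frame_period):
    """Frames per second, assuming one pixel position per CLK cycle."""
    return clk_hz / (line_period * frame_period)


# Assumed totals for 640x480 (not taken from this report):
LINE_PERIOD_640 = 800    # pixel clocks per line, including blanking
FRAME_PERIOD_480 = 525   # lines per frame, including blanking

for clk_mhz in (25, 30):
    rate = refresh_rate_hz(clk_mhz * 1_000_000, LINE_PERIOD_640, FRAME_PERIOD_480)
    print(f"CLK = {clk_mhz} MHz -> about {rate:.1f} Hz")
```

Under these assumptions the results land close to the nominal 60 Hz and 72 Hz figures, which suggests the quoted refresh rates are rounded values.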

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

                                  Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode – one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V) and the number of pixels stored in each byte in memory space:

Memory required for picture (bytes) = (PixTotal x LinTotal) / number of pixels per byte
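As a worked example of the memory formula, the following Python sketch computes the picture size in bytes for each color palette. The mode labels ("bw", "16", "64") and the function name are hypothetical, and rounding up to a whole byte is an assumption for display sizes that do not divide evenly.

```python
# Pixels stored per byte for each palette (labels are illustrative only).
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed for a pix_total x lin_total picture in the given mode."""
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division (assumed rounding)


for mode in ("bw", "16", "64"):
    print(mode, picture_memory_bytes(640, 480, mode))
```

The largest case covered by the Controller, 800x600 in 64 Colors mode, needs 480,000 bytes, which fits within the 524,288 locations addressable by the 19-bit ADDR_PIXEL bus.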

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit – hvcnt – which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address – the byte of data containing the next 8 pixels – is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address – the byte of data containing the next 2 pixels – is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

64 Colors mode

The picture memory address – the byte of data containing the next pixel – is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
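The three addressing rules can be modelled in a few lines of Python. This is a behavioural sketch only, not the Controller's HDL: hvcnt is treated as a 19-bit unsigned integer, the zero padding is represented by an ordinary right shift, and the mode labels are hypothetical.

```python
ADDR_MASK = (1 << 19) - 1  # ADDR_PIXEL and hvcnt are treated as 19 bits wide


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= ADDR_MASK
    if mode == "bw":   # 8 pixels per byte: drop the lowest three bits
        return hvcnt >> 3
    if mode == "16":   # 2 pixels per byte: drop the lowest bit
        return hvcnt >> 1
    if mode == "64":   # 1 pixel per byte: use hvcnt directly
        return hvcnt
    raise ValueError(f"unknown mode: {mode}")


# Pixels 0..7 share byte 0 in Black & White mode; pixel 8 starts byte 1.
print([addr_pixel(n, "bw") for n in range(10)])
```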

                                  The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register, PIXREG, is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
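The enex condition translates directly into code. The Python sketch below mirrors the comparison as written above; the line and frame periods are passed in as parameters because their values are not given in this section, and the figures used in the example call (800 and 525) are assumptions.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 0 during horizontal or vertical blanking, otherwise 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# RD follows enex, so memory reads stop while enex is 0.
# Example with an assumed 800 x 525 total raster for a 640x480 picture:
print(enex(100, 100, 640, 480, 800, 525))  # inside the visible area
print(enex(700, 100, 640, 480, 800, 525))  # horizontal blanking
```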

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen – Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal – received from the Synchronization Unit – are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal – received from the Synchronization Unit – is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
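The shift behaviour of PIXREG in the three modes can be sketched as a small Python model that returns the pixels of one byte in the order they reach the active position. It is a simplified illustration that ignores clocking and reset; the function name and mode labels are hypothetical.

```python
def pixels_in_byte(byte, mode):
    """Pixels held in one data byte, in the order they become active."""
    reg = byte & 0xFF
    if mode == "bw":
        pixels = []
        for _ in range(8):
            pixels.append(reg & 0x1)  # active pixel is the lowest bit
            reg >>= 1                 # shift right by one bit per clock
        return pixels
    if mode == "16":
        first = reg & 0xF             # active pixel is the low nibble
        reg >>= 4                     # shift right by four bits
        return [first, reg & 0xF]
    if mode == "64":
        return [reg & 0x3F]           # bits 7 and 6 are not used
    raise ValueError(f"unknown mode: {mode}")


print(pixels_in_byte(0b10110001, "bw"))
print(pixels_in_byte(0xA3, "16"))
```

Note that in both packed modes the least significant pixel is displayed first, which determines how the host must pack pixels when writing the picture into memory.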

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register, RGBREG, is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
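The count of 64 colors follows from three independent 2-bit channels (4 x 4 x 4). The Python sketch below makes this concrete under two assumptions that are not confirmed by the text: the DAC levels are evenly spaced between 0 V and 0.7 V, and the 6-bit code is laid out as red in bits 5..4, green in bits 3..2 and blue in bits 1..0.

```python
V_MAX = 0.7  # volts at maximum brightness


def dac_voltage(code):
    """Analog level for a 2-bit code, assuming evenly spaced levels."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value")
    return V_MAX * code / 3


def split_rgb(rgb6):
    """Split a 6-bit code into (R, G, B), assuming an RRGGBB bit layout."""
    return (rgb6 >> 4) & 0x3, (rgb6 >> 2) & 0x3, rgb6 & 0x3


colors = {split_rgb(value) for value in range(64)}
print(len(colors), "distinct colors")
print([round(dac_voltage(c), 3) for c in range(4)])
```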

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device | RAM Columns | RAM Blocks per Column | Total RAM Blocks | Total RAM Bits | Total RAM Kbits
XC3S100E | 1 | 4 | 4 | 73,728 | 72
XC3S250E | 2 | 6 | 12 | 221,184 | 216
XC3S500E | 2 | 10 | 20 | 368,640 | 360
XC3S1200E | 2 | 14 | 28 | 516,096 | 504
XC3S1600E | 2 | 18 | 36 | 663,552 | 648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
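The totals in the Spartan-3E block RAM table follow from the 18,432 bits per block. A short Python check, using the column and block counts from the table (the dictionary and function names are hypothetical):

```python
BITS_PER_BLOCK = 18432  # 16K data bits plus 2K parity bits

# Device: (RAM columns, RAM blocks per column), as listed in the table.
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def block_ram_totals(columns, blocks_per_column):
    """Return (total blocks, total bits, total Kbits) for one device."""
    blocks = columns * blocks_per_column
    bits = blocks * BITS_PER_BLOCK
    return blocks, bits, bits // 1024


for name, (cols, per_col) in DEVICES.items():
    print(name, block_ram_totals(cols, per_col))
```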

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                  Interoperability

                                  Technology independence

                                  Design reuse

                                  Several levels of abstraction

                                  Readability

                                  Standard language

                                  Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                                  Design specification language

                                  Design entry language

                                  Design simulation language

                                  Design documentation language

                                  An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own, non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.

                                  VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with interface definition.

Defines the inputs/outputs of a component (define pins).

A way to represent modularity in VHDL.

Similar to symbol in schematic.

Entity declaration describes entity.

E.g.:

entity Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, direction and a type.

An entity may have NO port declaration.

                                  Port directions In A value of a port can be read inside the component but cannot be assigned

                                  Multiple reads of port are allowed

                                  Out Assignments can be made to a port but data from a port cannot be read

                                  Multiple assignments are allowed

                                  In out Bi-directional assignments can be made and data can be read Multiple

                                  assignments are allowed

                                  Buffer An out port with read capability May have at most one assignment

                                  (are not recommended)
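The port rules above can be illustrated with a small sketch. The entity `counter4` and the signal `count_i` are hypothetical names chosen for this example; they do not appear in the VGA design. Because the value of an `out` port is not read back here, the count is kept in an internal signal and copied to the port.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port modes.
entity counter4 is
  port (clk   : in  std_logic;                      -- "in": read inside, never assigned
        rst_n : in  std_logic;                      -- active-low asynchronous reset
        count : out std_logic_vector(3 downto 0));  -- "out": assigned, never read
end counter4;

architecture rtl of counter4 is
  -- Internal copy of the count; unlike the out port, it may be read.
  signal count_i : unsigned(3 downto 0);
begin
  -- Drive the out port from the internal signal.
  count <= std_logic_vector(count_i);

  process (clk, rst_n)
  begin
    if rst_n = '0' then
      count_i <= (others => '0');
    elsif rising_edge(clk) then
      count_i <= count_i + 1;  -- reads count_i, not the port
    end if;
  end process;
end rtl;
```

Using an internal signal in this way is a common alternative to declaring the port as `buffer`.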

Architectures: Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels:
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity.
E.g.:

    architecture Comparator1 of Comparator is
    begin
      EQ <= '1' when (A = B) else '0';
    end Comparator1;

Configurations: Link an entity declaration and an architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
– The last architecture analyzed is linked to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Configurations are always optional.
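As a minimal sketch of how a configuration selects one of several architectures, the Comparator entity from the earlier example is repeated here with two architectures. `Comparator2` and `Comparator_cfg` are hypothetical names introduced only for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow architecture, as in the example in the text.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture: the same function written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: explicitly binds the entity to Comparator2
-- instead of relying on the default (last analyzed) architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating `Comparator_cfg` rather than the bare entity makes the choice of architecture explicit, which is one way to switch simulation behavior without editing the source.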

Packages: Packages contain information common to many design units.
1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations
2. Package body
– is not necessarily needed
– function bodies
– procedure bodies
Packages are meant for encapsulating data that can be shared globally among several design units.
A package consists of a declaration part and an optional body part.
A package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations
A package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
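The split between package declaration and package body might look like the following sketch. The package `util_pkg` and everything in it (`DATA_WIDTH`, `byte_t`, `ALL_ONES`, `parity`) are hypothetical and serve only to show a constant, a subtype, a deferred constant, and a function.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the publicly visible part.
package util_pkg is
  constant DATA_WIDTH : integer := 8;
  subtype byte_t is std_logic_vector(DATA_WIDTH - 1 downto 0);
  constant ALL_ONES : byte_t;  -- deferred constant: value given in the body
  function parity(v : std_logic_vector) return std_logic;
end util_pkg;

-- Package body: needed here because of the function and the deferred constant.
package body util_pkg is
  constant ALL_ONES : byte_t := (others => '1');

  -- Returns '1' when v contains an odd number of '1' bits.
  function parity(v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end util_pkg;
```

A design unit would make these names visible with `use work.util_pkg.all;`, assuming the package is compiled into the working library.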

Libraries: A collection of VHDL design units (a database):
1. Packages
– package declaration
– package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in a UNIX file system.
Can also be any other kind of database.

                                  Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:
– Structural
– Dataflow
– Behavioral

Structural VHDL description

The circuit is described in terms of its components.
The description can range from low level (e.g., a transistor-level description) to high level (e.g., a block diagram).
For large circuits, a low-level description quickly becomes impractical.
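A structural description contains only component instances and the wiring between them. The sketch below builds a half adder from two gate components; `and2`, `xor2`, and `half_adder` are hypothetical names used for illustration and are not part of the VGA design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf component: 2-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture dataflow of and2 is
begin
  y <= a and b;
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf component: 2-input XOR gate.
entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end dataflow;

library ieee;
use ieee.std_logic_1164.all;

-- Structural half adder: no logic equations, only instances and connections.
entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end half_adder;

architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : and2 port map (a => a, b => b, y => carry);
end structural;
```

The top level of the VGA controller listed later in this report follows the same pattern, wiring a sync unit, a pixel generator, and a memory together.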

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
– either a lower-level description
– or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the behavioral description of the registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).
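Since the language has no built-in register, a register is usually written as a small clocked process. The following is one possible sketch, assuming an asynchronous active-low reset; the entity `reg8` and its port names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous active-low reset.
entity reg8 is
  port (clk   : in  std_logic;
        rst_n : in  std_logic;
        d     : in  std_logic_vector(7 downto 0);
        q     : out std_logic_vector(7 downto 0));
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst_n)
  begin
    if rst_n = '0' then
      q <= (others => '0');    -- reset clears the stored value
    elsif rising_edge(clk) then
      q <= d;                  -- capture the input on each rising clock edge
    end if;
  end process;
end behavioral;
```

Synthesis tools generally recognize this process template and map it onto flip-flops.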

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns).
If no actual delay is used, only the order of sequential operations is defined.
At the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.
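The delayed assignment mentioned above can be tried in a simulation-only sketch such as the one below. The entity `delay_demo` is hypothetical; it has no ports, and the `after` clause is meaningful in simulation only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only entity (no ports).
entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
  signal A, B : std_logic := '0';
begin
  -- A follows B with a 10 ns delay, as in the example in the text.
  A <= B after 10 ns;

  stimulus : process
  begin
    B <= '1';
    wait for 5 ns;   -- 5 ns after the change: A must still hold its old value
    assert A = '0' report "A changed before the 10 ns delay elapsed" severity error;
    wait for 10 ns;  -- 15 ns after the change: A must have followed B
    assert A = '1' report "A did not follow B after 10 ns" severity error;
    wait;            -- suspend forever
  end process;
end behavioral;
```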

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
The process keyword in VHDL
A process statement is a concurrent statement
Statements inside a process statement are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals that wake the process up
General format:

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
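The two ways of waking a process can be compared side by side. In this sketch (the entity `and_gate_procs` and its ports are hypothetical), both processes describe the same AND gate: one uses a sensitivity list, the other an equivalent `wait on` statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity showing two equivalent process forms.
entity and_gate_procs is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end and_gate_procs;

architecture behavioral of and_gate_procs is
begin
  -- Form 1: sensitivity list; the process runs whenever a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list; an explicit wait statement at the end
  -- suspends the process until a or b changes.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process;
end behavioral;
```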

                                  11 VGA CONTROLLER CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;
    use IEEE.numeric_std.ALL;

    ---- Uncomment the following library declaration if instantiating
    ---- any Xilinx primitives in this code.
    --library UNISIM;
    --use UNISIM.VComponents.all;

    entity vga_controller is
      port (clk50m : in  std_logic;
            rstn   : in  std_logic;
            hsync  : out std_logic;
            vsync  : out std_logic;
            red    : out std_logic_vector(0 downto 0);
            green  : out std_logic_vector(0 downto 0);
            blue   : out std_logic_vector(0 downto 0));
    end vga_controller;

    architecture Behavioral of vga_controller is

      component memory
        port (addr  : in  std_logic_vector(17 downto 0);
              clk   : in  std_logic;
              dout  : out std_logic_vector(0 downto 0);
              en    : in  std_logic;
              sinit : in  std_logic);
      end component;

      component sync_unit
        port (clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : out std_logic;  -- horizontal sync pulse (row)
              vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
              v_en_reg   : out std_logic;
              h_en       : out std_logic;
              v_en       : out std_logic;
              line_count : out integer;    -- mode matches the sync_unit entity
              c_hs       : out integer);   -- mode matches the sync_unit entity
      end component;

      component pixelgen
        port (clk        : in  std_logic;
              rstn       : in  std_logic;
              hsync      : in  std_logic;
              vsync      : in  std_logic;
              h_en       : in  std_logic;
              v_en       : in  std_logic;
              v_en_reg   : in  std_logic;
              data_in    : in  std_logic_vector(0 downto 0);
              line_count : in  integer;
              c_hs       : in  integer;
              red        : out std_logic_vector(0 downto 0);
              green      : out std_logic_vector(0 downto 0);
              blue       : out std_logic_vector(0 downto 0);
              addr       : out std_logic_vector(17 downto 0);
              rd         : out std_logic);
      end component;

      signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
      signal line_count_s, c_hs_s, addr : integer;
      signal data_s : std_logic_vector(0 downto 0);
      signal addr_s : std_logic_vector(17 downto 0);

    begin

      --addr_s <= std_logic_vector(to_unsigned(addr, 18));

      -- Divide the 50 MHz input clock by two to obtain the pixel clock.
      process (clk50m, rstn)
      begin
        if (rstn = '0') then
          clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
          clk <= not clk;
        end if;
        --end if;
      end process;

      hsync <= hsync_s;
      vsync <= vsync_s;

      sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
      );

      pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
      );

      mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
      );

    end Behavioral;

                                  SYNCHRONIZATION BLOCK CODE

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity sync_unit is
      port (clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer);
    end sync_unit;

    architecture arch_sync_unit of sync_unit is
      signal h_en_s, v_en_s : std_logic;
      signal line_count_s, c_hs_s : integer;
    begin

      c_hs       <= c_hs_s;
      line_count <= line_count_s;
      h_en       <= h_en_s;
      v_en       <= v_en_s;

      -- hsync counter: counts pixel clocks 0 to 793 within each line
      process (clk, rstn)
      begin
        if (rstn = '0') then
          c_hs_s <= 0;
        else
          if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
              c_hs_s <= 0;
            else
              c_hs_s <= c_hs_s + 1;
            end if;
          end if;
        end if;
      end process;

      -- vsync counter
      -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
      process (h_en_s, rstn)
      begin
        if (rstn = '0') then
          line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
          line_count_s <= line_count_s + 1;
          if (line_count_s = 515) then
            line_count_s <= 0;
          end if;
        end if;
        --end if;
      end process;

      -- hsync pulse: low while the horizontal counter is in 0 to 95
      process (clk, rstn)
      begin
        if (rstn = '0') then
          hsync <= '1';
        elsif (clk = '1' and clk'event) then
          if (c_hs_s <= 95) then
            hsync <= '0';
          else
            hsync <= '1';
          end if;
        end if;
        --end if;
      end process;

      -- Horizontal enable (row information will be sent in this period)
      process (clk, rstn)
      begin
        if (rstn = '0') then
          h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
          if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
          else
            h_en_s <= '0';
          end if;
        end if;
        --end if;
      end process;

      -- vsync pulse
      -- Vertical enable (data is valid in this region)
      process (clk, rstn)
      begin
        if (rstn = '0') then
          v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
          if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
          else
            v_en_s <= '0';
          end if;
        end if;
        --end if;
      end process;

      -- Vertical sync, front porch, back porch
      process (clk, rstn)
      begin
        if (rstn = '0') then
          vsync <= '1';
        elsif (clk = '1' and clk'event) then
          if (line_count_s < 2) then
            vsync <= '0';
          else
            vsync <= '1';
          end if;
        end if;
        --end if;
      end process;

      -- One-clock delayed copy of the vertical enable
      process (clk)
      begin
        if (clk = '1' and clk'event) then
          v_en_reg <= v_en_s;
        end if;
      end process;

    end arch_sync_unit;
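The numeric literals in the synchronization block could be collected as named constants, which makes the timing easier to read. The sketch below is only an illustration: the package `vga_timing_pkg` and all constant names are hypothetical, and the values are copied from the comparisons in the listing above, not from a timing standard.

```vhdl
-- Hypothetical package; names are illustrative, values come from sync_unit.
package vga_timing_pkg is
  -- Horizontal timing, in pixel clocks
  constant H_COUNT_MAX : integer := 793;  -- c_hs_s wraps to 0 after this value
  constant H_SYNC_END  : integer := 95;   -- hsync is low while c_hs_s <= 95
  constant H_EN_START  : integer := 143;  -- first count of the h_en window
  constant H_EN_END    : integer := 778;  -- last count of the h_en window

  -- Vertical timing, in lines
  constant V_COUNT_MAX  : integer := 515; -- line_count_s wraps to 0 after this value
  constant V_SYNC_LINES : integer := 2;   -- vsync is low while line_count_s < 2
  constant V_EN_START   : integer := 35;  -- first line of the v_en window
  constant V_EN_END     : integer := 515; -- last line of the v_en window

  -- Quantities derived from the values above
  constant H_TOTAL      : integer := H_COUNT_MAX + 1;            -- 794 clocks per line
  constant H_SYNC_WIDTH : integer := H_SYNC_END + 1;             -- 96 clocks
  constant H_EN_WIDTH   : integer := H_EN_END - H_EN_START + 1;  -- 636 clocks
  constant V_TOTAL      : integer := V_COUNT_MAX + 1;            -- 516 lines per frame
end vga_timing_pkg;
```

With such a package, a comparison like `c_hs_s = 793` could be written as `c_hs_s = H_COUNT_MAX`, assuming the design adds `use work.vga_timing_pkg.all;`.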

                                  PIXEL GENERATOR

    library IEEE;
    use IEEE.STD_LOGIC_1164.ALL;
    use IEEE.STD_LOGIC_ARITH.ALL;
    use IEEE.STD_LOGIC_UNSIGNED.ALL;

    entity pixelgen is
      port (clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
           );
    end pixelgen;

    architecture behv of pixelgen is
      signal addr_s : std_logic_vector(17 downto 0);
    begin

      addr <= addr_s;

      process (clk, rstn)
      begin
        if (rstn = '0') then
          rd     <= '0';
          red    <= "0";
          green  <= "0";
          blue   <= "0";
          addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
          -- Restart the memory address at the rising edge of v_en (start of frame).
          if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
          end if;
          if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
              if (c_hs <= 655) then
                -- Image area: read the next pixel from memory.
                rd     <= '1';
                red    <= data_in;
                green  <= data_in;
                blue   <= data_in;
                addr_s <= addr_s + "000000000000000001";
              else
                -- Right of the image: green border.
                rd    <= '0';
                red   <= "0";
                green <= "1";
                blue  <= "0";
              end if;
            else
              -- Below the image: white.
              rd    <= '0';
              red   <= "1";
              green <= "1";
              blue  <= "1";
            end if;
            -- Wrap the address after the last pixel of the stored image.
            if (addr_s = "111011111111111111") then
              addr_s <= (others => '0');
            end if;
          else
            -- Blanking: drive black.
            red   <= "0";
            green <= "0";
            blue  <= "0";
          end if;
        end if;
        -- end if;
        --end if;
      end process;

    end behv;

                                  12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

                                  121 Modern developments

                                  A recent trend has been to take the coarse-grained architectural

                                  approach a step further by combining the logic blocks and interconnects of

                                  traditional FPGAs with embedded microprocessors and related peripherals to

                                  form a complete system on a programmable chip This work mirrors the

                                  architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems

                                  Group which combined a reconfigurable CPU architecture on a single chip

                                  called the SB24 That work was done in 1982 Examples of such hybrid

                                  technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices

                                  which include one or more PowerPC processors embedded within the FPGAs

                                  54

                                  logic fabric The Atmel FPSLIC is another such device which uses an AVR

                                  processor in combination with Atmels programmable logic architecture

                                  An alternate approach to using hard-macro processors is to make use of soft

                                  processor cores that are implemented within the FPGA logic (See Soft

                                  processors below)

                                  As previously mentioned, many modern FPGAs have the ability to be

                                  reprogrammed at run time, and this is leading to the idea of reconfigurable

                                  computing or reconfigurable systems: CPUs that reconfigure themselves to

                                  suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an

                                  example of a reconfigurable soft processor implemented on FPGAs. However,

                                  it does not support dynamic reconfiguration at runtime, but instead adapts itself

                                  to a specific program.

                                  Additionally, new non-FPGA architectures are beginning to emerge.

                                  Software-configurable microprocessors such as the Stretch S5000 adopt a

                                  hybrid approach by providing an array of processor cores and FPGA-like

                                  programmable cores on the same chip.

                                  Gates

                                  1987: 9,000 gates (Xilinx)

                                  1992: 600,000 gates (Naval Surface Warfare Department)

                                  Early 2000s: millions of gates

                                  Market size

                                  1985: First commercial FPGA technology invented by Xilinx

                                  1987: $14 million

                                  ~1993: >$385 million

                                  2005: $1.9 billion

                                  2010 estimates: $2.75 billion


                                  12.2 FPGA Comparisons

                                  Historically, FPGAs have been slower, less energy efficient, and generally

                                  achieved less functionality than their fixed ASIC counterparts. A combination

                                  of volume, fabrication improvements, research and development, and the I/O

                                  capabilities of new supercomputers has largely closed the performance gap

                                  between ASICs and FPGAs.

                                  Advantages include a shorter time to market, the ability to re-program in the

                                  field to fix bugs, and lower non-recurring engineering costs. Vendors can also

                                  take a middle road by developing their hardware on ordinary FPGAs but

                                  manufacturing the final version so that it can no longer be modified after the

                                  design has been committed.

                                  Xilinx claims that several market and technology dynamics are changing the

                                  ASIC/FPGA paradigm:

                                  • IC costs are rising aggressively

                                  • ASIC complexity has increased development time and costs

                                  • R&D resources and headcount are decreasing

                                  • Revenue losses for slow time-to-market are increasing

                                  • Financial constraints in a poor economy are driving low-cost technologies

                                  These trends make FPGAs a better alternative than ASICs for a growing

                                  number of higher-volume applications than they have historically been used

                                  for, which the company credits for the growing number of FPGA design starts

                                  (see History).

                                  The primary differences between CPLDs and FPGAs are architectural. A

                                  CPLD has a somewhat restrictive structure consisting of one or more

                                  programmable sum-of-products logic arrays feeding a relatively small number

                                  of clocked registers. The result of this is less flexibility, with the advantage of

                                  more predictable timing delays and a higher logic-to-interconnect ratio. The

                                  FPGA architectures, on the other hand, are dominated by interconnect. This

                                  makes them far more flexible (in terms of the range of designs that are

                                  practical for implementation within them) but also far more complex to design

                                  for.

                                  Another notable difference between CPLDs and FPGAs is the presence in most

                                  FPGAs of higher-level embedded functions (such as adders and multipliers)

                                  and embedded memories, as well as logic blocks that implement decoders

                                  or mathematical functions.

                                  Some FPGAs have the capability of partial re-configuration, which lets one

                                  portion of the device be re-programmed while other portions continue running.

                                  12.3 Applications

                                  Applications of FPGAs include digital signal processing, software-defined

                                  radio, aerospace and defense systems, ASIC prototyping, medical imaging,

                                  computer vision, speech recognition, cryptography, bioinformatics, computer

                                  hardware emulation, radio astronomy, and a growing range of other areas.

                                  FPGAs originally began as competitors to CPLDs and competed in a similar

                                  space, that of glue logic for PCBs. As their size, capabilities, and speed

                                  increased, they began to take over larger and larger functions, to the point where

                                  some are now marketed as full systems on chips (SoC). Particularly with the

                                  introduction of dedicated multipliers into FPGA architectures in the late 1990s,

                                  applications which had traditionally been the sole reserve of DSPs began to

                                  incorporate FPGAs instead.

                                  FPGAs especially find applications in any area or algorithm that can make use

                                  of the massive parallelism offered by their architecture. One such area is code

                                  breaking, in particular brute-force attacks on cryptographic algorithms.

                                  FPGAs are increasingly used in conventional high performance computing

                                  applications, where computational kernels such as FFT or convolution are

                                  performed on the FPGA instead of a microprocessor.


                                  The inherent parallelism of the logic resources on an FPGA allows for

                                  considerable computational throughput even at a low MHz clock rate. The

                                  flexibility of the FPGA allows for even higher performance by trading off

                                  precision and range in the number format for an increased number of parallel

                                  arithmetic units. This has driven a new type of processing called

                                  reconfigurable computing, where time-intensive tasks are offloaded from

                                  software to FPGAs.

                                  The adoption of FPGAs in high performance computing is currently limited by

                                  the complexity of FPGA design compared to conventional software and the

                                  extremely long turn-around times of current design tools, where a wait of 4 to 8

                                  hours is necessary after even minor changes to the source code.

                                  Traditionally, FPGAs have been reserved for specific vertical applications

                                  where the volume of production is small. For these low-volume applications,

                                  the premium that companies pay in hardware cost per unit for a

                                  programmable chip is more affordable than the development resources spent

                                  on creating an ASIC. Today, new cost and

                                  performance dynamics have broadened the range of viable applications.
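The cost trade-off described above can be illustrated with a simple break-even calculation. This is a minimal sketch: the function name and all cost figures are hypothetical and are not taken from this report or from any vendor data.

```python
def break_even_volume(asic_nre, asic_unit_cost, fpga_nre, fpga_unit_cost):
    """Return the production volume at which ASIC and FPGA total costs are equal.

    Total cost is modelled as: one-off engineering cost (NRE) + units * unit cost.
    Returns None when the FPGA unit cost is not higher than the ASIC unit cost,
    because then there is no per-unit premium to recover.
    """
    unit_premium = fpga_unit_cost - asic_unit_cost
    if unit_premium <= 0:
        return None
    return (asic_nre - fpga_nre) / unit_premium


# Hypothetical figures, chosen only to illustrate the calculation.
volume = break_even_volume(
    asic_nre=1_000_000, asic_unit_cost=5,
    fpga_nre=50_000, fpga_unit_cost=55,
)
print(volume)
```

Below the break-even volume the FPGA is the cheaper option overall; above it, the ASIC's lower unit cost outweighs its higher development cost.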

                                  Architecture

                                  The most common FPGA architecture consists of an array of configurable

                                  logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing

                                  channels have the same width (number of wires). Multiple I/O pads may fit

                                  into the height of one row or the width of one column in the array.

                                  An application circuit must be mapped into an FPGA with adequate resources.

                                  While the number of CLBs and I/Os required is easily determined from the

                                  design, the number of routing tracks needed may vary considerably even

                                  among designs with the same amount of logic. (For example, a crossbar switch

                                  requires much more routing than a systolic array with the same gate count.)

                                  Since unused routing tracks increase the cost (and decrease the performance)

                                  of the part without providing any benefit, FPGA manufacturers try to provide

                                  just enough tracks so that most designs that will fit in terms of LUTs and I/Os

                                  can be routed. This is determined by estimates such as those derived from

                                  Rent's rule or by experiments with existing designs.

                                  The FPGA is an array or island-style FPGA. It consists of an array of logic

                                  blocks and routing channels. Two I/O pads fit into the height of one row or the

                                  width of one column, as shown below. All the routing channels have the same

                                  width (number of wires).

                                  121 FPGA structure

                                  A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

                                  flip-flop, as shown below. In recent years, manufacturers have started moving

                                  to 6-input LUTs in their high performance parts, claiming increased

                                  performance.

                                  Typical logic block


                                  There is only one output, which can be either the registered or the unregistered

                                  LUT output. The logic block has four inputs for the LUT and a clock input.

                                  Since clock signals (and often other high-fanout signals) are normally routed

                                  via special-purpose dedicated routing networks in commercial FPGAs, they

                                  are managed separately from other signals.
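The logic block described above can be modelled in software as a 16-entry truth table followed by an optional flip-flop. The sketch below is a behavioural illustration only; the class name, the input ordering, and the `and4_table` helper are assumptions made for this example, not part of any vendor's architecture.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = [bit & 1 for bit in truth_table]
        self.registered = registered  # selects registered or unregistered output
        self.q = 0                    # flip-flop state, assumed to reset to 0

    def lut(self, a, b, c, d):
        """Look up the combinational output; 'a' is assumed to be the LSB."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the single block output (registered or unregistered)."""
        return self.q if self.registered else self.lut(a, b, c, d)


def and4_table():
    """Truth table that configures the LUT as a 4-input AND gate."""
    return [1 if i == 0b1111 else 0 for i in range(16)]


combinational = LogicBlock(and4_table())
registered = LogicBlock(and4_table(), registered=True)
registered.clock(1, 1, 1, 1)
```

Changing only the truth table turns the same block into any other 4-input function, which is what makes the LUT the basic programmable element.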

                                  For this example architecture, the locations of the FPGA logic block pins are

                                  shown below.

                                  Logic Block Pin Locations

                                  Each input is accessible from one side of the logic block, while the output pin

                                  can connect to routing wires in both the channel to the right and the channel

                                  below the logic block.

                                  Each logic block output pin can connect to any of the wiring segments in the

                                  channels adjacent to it.

                                  Similarly, an I/O pad can connect to any one of the wiring segments in the

                                  channel adjacent to it. For example, an I/O pad at the top of the chip can

                                  connect to any of the W wires (where W is the channel width) in the horizontal

                                  channel immediately below it.

                                  Generally, the FPGA routing is unsegmented. That is, each wiring segment

                                  spans only one logic block before it terminates in a switch box. By turning on

                                  some of the programmable switches within a switch box, longer paths can be

                                  constructed. For higher speed interconnect, some FPGA architectures use

                                  longer routing lines that span multiple logic blocks.

                                  Whenever a vertical and a horizontal channel intersect, there is a switch box. In

                                  this architecture, when a wire enters a switch box, there are three

                                  programmable switches that allow it to connect to three other wires in adjacent

                                  channel segments. The pattern, or topology, of switches used in this

                                  architecture is the planar or domain-based switch box topology. In this switch

                                  box topology, a wire in track number one connects only to wires in track

                                  number one in adjacent channel segments, wires in track number 2 connect

                                  only to other wires in track number 2, and so on. The figure below illustrates

                                  the connections in a switch box.

                                  Switch box topology
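The planar (domain-based) rule described above can be expressed in a few lines. This is an illustrative sketch: the side names and the function name are assumptions for this example, and tracks are numbered from one as in the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """List the (side, track) wires reachable from a wire entering a switch box.

    In the planar topology a wire can only connect to wires with the same
    track number, one on each of the three other sides of the switch box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


# A wire entering from the west on track 1 of a 4-track channel.
targets = planar_switch_targets("west", 1, channel_width=4)
print(targets)
```

Each entry in the returned list corresponds to one of the three programmable switches mentioned in the text.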

                                  Modern FPGA families expand upon the above capabilities to include higher

                                  level functionality fixed into the silicon. Having these common functions

                                  embedded into the silicon reduces the area required and gives those functions

                                  increased speed compared to building them from primitives. Examples of these

                                  include multipliers, generic DSP blocks, embedded processors, high speed I/O

                                  logic, and embedded memories.


                                  FPGAs are also widely used for systems validation, including pre-silicon

                                  validation, post-silicon validation, and firmware development. This allows chip

                                  companies to validate their design before the chip is produced in the factory,

                                  reducing the time to market.

                                  12.4 FPGA Design and Programming

                                  To define the behavior of the FPGA, the user provides a hardware description

                                  language (HDL) or a schematic design. The HDL form might be easier to work

                                  with when handling large structures, because it is possible to just specify them

                                  numerically rather than having to draw every piece by hand. On the other hand,

                                  schematic entry can allow for easier visualisation of a design.

                                  Then, using an electronic design automation tool, a technology-mapped netlist

                                  is generated. The netlist can then be fitted to the actual FPGA architecture

                                  using a process called place-and-route, usually performed by the FPGA

                                  company's proprietary place-and-route software. The user will validate the

                                  map, place, and route results via timing analysis, simulation, and other

                                  verification methodologies. Once the design and validation process is

                                  complete, the binary file generated (also using the FPGA company's

                                  proprietary software) is used to (re)configure the FPGA.

                                  Going from schematic/HDL source files to actual configuration: the source

                                  files are fed to a software suite from the FPGA/CPLD vendor that, through

                                  different steps, will produce a file. This file is then transferred to the

                                  FPGA/CPLD via a serial interface (JTAG) or to an external memory device

                                  like an EEPROM.

                                  The most common HDLs are VHDL and Verilog, although, in an attempt to

                                  reduce the complexity of designing in HDLs, which have been compared to the

                                  equivalent of assembly languages, there are moves to raise the abstraction level

                                  through the introduction of alternative languages.


                                  To simplify the design of complex systems in FPGAs, there exist libraries of

                                  predefined complex functions and circuits that have been tested and optimized

                                  to speed up the design process. These predefined circuits are commonly called

                                  IP cores, and are available from FPGA vendors and third-party IP suppliers

                                  (rarely free, and typically released under proprietary licenses). Other

                                  predefined circuits are available from developer communities such as

                                  OpenCores (typically free, and released under the GPL, BSD, or similar license) and

                                  other sources.

                                  In a typical design flow, an FPGA application developer will simulate the

                                  design at multiple stages throughout the design process. Initially, the RTL

                                  description in VHDL or Verilog is simulated by creating test benches to

                                  simulate the system and observe results. Then, after the synthesis engine has

                                  mapped the design to a netlist, the netlist is translated to a gate level

                                  description, where simulation is repeated to confirm the synthesis proceeded

                                  without errors. Finally, the design is laid out in the FPGA, at which point

                                  propagation delays can be added and the simulation run again with these

                                  values back-annotated onto the netlist.

                                  Basic Process Technology Types

                                  • SRAM - based on static memory technology. In-system programmable and

                                  re-programmable. Requires external boot devices. CMOS.

                                  • Antifuse - One-time programmable. CMOS.

                                  • EPROM - Erasable Programmable Read-Only Memory technology. Usually

                                  one-time programmable in production because of plastic packaging.

                                  Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                  • EEPROM - Electrically Erasable Programmable Read-Only Memory

                                  technology. Can be erased, even in plastic packages. Some, but not all,

                                  EEPROM devices can be in-system programmed. CMOS.

                                  • Flash - Flash-erase EPROM technology. Can be erased, even in plastic

                                  packages. Some, but not all, flash devices can be in-system programmed.

                                  Usually, a flash cell is smaller than an equivalent EEPROM cell and is

                                  therefore less expensive to manufacture. CMOS.

                                  • Fuse - One-time programmable. Bipolar.

                                  Major Manufacturers

                                  Xilinx and Altera are the current FPGA market leaders and long-time industry

                                  rivals. Together, they control over 80 percent of the market, with Xilinx alone

                                  representing over 50 percent.

                                  Xilinx also provides free Windows and Linux design software, while Altera

                                  provides free Windows tools; the Solaris and Linux tools are only available via

                                  a rental scheme.

                                  Other competitors include Lattice Semiconductor (flash, SRAM), Actel

                                  (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),

                                  Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld

                                  focused CSSP, no general purpose FPGAs).

                                  12.5 FPGA prototype

                                  FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC

                                  prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for

                                  hardware verification and early software development.

                                  Verifying a hardware design together with early software and

                                  firmware co-design has become mainstream. Prototyping SoC and ASIC

                                  designs on an FPGA has become a good method to do this.


                                  Reasons why prototyping is important:

                                  1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is

                                  functionally correct. This is compared to designers relying only on software

                                  simulations to verify that their hardware design is sound. Simulation speed and

                                  modeling accuracy limitations hinder this development.

                                  2. Due to time constraints, many projects cannot wait until the silicon is back from

                                  the foundry to start on software tests. FPGA prototyping allows for much more

                                  time in the area of software development and testing at the software-hardware

                                  integration stage. This allows many unforeseen software bugs, which appear due

                                  to today's array of operating systems, applications, and hardware, to be found.

                                  3. Prototyping also allows the developer to ensure that all IP technologies on the

                                  system work well together, off the simulation stage and in actual form.

                                  4. Prototypes have the added advantage of serving as demo platforms for SoC clients,

                                  bringing in interest early. This speeds up the overall development cycle and

                                  allows for more enhancement or improvement of the chip features than would

                                  otherwise have been possible.


                                  13 SIMULATION RESULTS

                                  GATE LEVEL

                                  Fig. VGA CONTROLLER

                                  GATE LEVEL

                                  Fig. SYNC UNIT

                                  TECHNOLOGY SCHEMATIC

                                  14 APPLICATIONS

                                  Motion Tracking Systems

                                  Image processing systems

                                  Displaying Systems

                                  14.1 CONCLUSION

                                  Hence, by implementing the VGA Controller on an FPGA, a processor-generated

                                  picture (pixelated) is taken from memory space, and the Controller provides

                                  digital RGB values for each pixel, as well as horizontal and vertical

                                  synchronization signals, in order to correctly display the picture on a

                                  connected monitor.


                                  REFERENCES

                                  1. http://en.wikipedia.org/wiki/Video_Graphics_Array

                                  2. Eduardo Sanchez, A VGA display controller

                                  3. Hinner, "VGA Interface and video signal documents"



                                    640x480: CLK = 25 MHz, Refresh = 60 Hz; CLK = 30 MHz, Refresh = 72 Hz

                                    800x600: CLK = 40 MHz, Refresh = 60 Hz; CLK = 50 MHz, Refresh = 72 Hz

                                    RST (I, High): Global system reset.

                                    VGA Input Settings Signals

                                    RESOLUTION (I, High/Low): This input selects the screen resolution to be used. 1 = 640x480 (VGA), 0 = 800x600 (SVGA).

                                    CMOD (I, 2 bits): This input selects the color palette to be used. 00 = Black & White, 01 = 16 Colors, 10 = 64 Colors.

                                    DISPSIZE_H (I, 10 bits): This input determines the number of viewable pixels to be displayed in each line of a frame, and is therefore used to control the horizontal extents of the visible display area.

                                    DISPSIZE_V (I, 10 bits): This input determines the number of lines to be displayed in a frame, and is therefore used to control the vertical extents of the visible display area.

                                    Data Memory Control Signals

                                    RD (O, High): This is the enable signal when data is required to be read from the memory space. This signal is controlled by, and follows, the internal line enable signal, en, generated by the Synchronization Unit of the Controller.

                                    DATA (I, 8 bits): Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows. Black & White mode: 8 1-bit pixels. 16 Colors mode: 2 4-bit pixels. 64 Colors mode: 1 6-bit pixel (DATA[5..0]). Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                                    ADDR_PIXEL (O, 19 bits): Specifies the address of the next pixel in picture memory. Addresses are consecutive: once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                                    VGA Monitor Control Signals

                                    HSYNC (O, Falling): Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                                    VSYNC (O, Falling): Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                                    R1 (O, High/Low): Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                    G1 (O, High/Low): Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                    B1 (O, High/Low): Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                    9 VGA SIGNAL TIMING

                                    Figure 3 summarizes the signal timing involved in sending a line of

                                    pixels and a frame of lines. The actual time values differ according to the

                                    resolution selected (640x480 or 800x600), the processor-defined values for

                                    DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal

                                    (CLK).


                                    9.1 VGA PIXEL UNIT

                                    The Pixel Unit provides access to the pixelated image stored in external

                                    picture memory, reading in data a byte at a time and formatting each pixel to

                                    be displayed. For each pixel, the 6-bit RGB value required for the control of

                                    the monitor's three primary color electron guns is generated, so that the pixel

                                    is displayed on the screen with the correct color.

                                    Accessing the stored image

                                    The image to be displayed on the monitor screen is written by the host

                                    microcontroller into external memory space (RAM). This memory space can

                                    be located anywhere (e.g. using a block of RAM within the FPGA design, or

                                    using the SRAM on the NanoBoard itself).

                                    Pixel data is stored in the memory space in bytes. The number of pixels in a

                                    byte depends on which color palette is being used for the image (selected by

                                    the CMOD input):

                                    • in Black & White mode: one byte of data in RAM contains 8 1-bit pixels

                                    • in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels

                                    • in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel

                                    (DATA[7..6] are not used)


The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
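
As a worked example of this formula, assume the full 640x480 resolution is viewable, so that PixTotal = 640 and LinTotal = 480. These values are chosen only for illustration and are not prescribed by the text. The three palettes would then need:

```latex
\begin{align*}
\text{Black \& White (8 pixels per byte):} \quad & \frac{640 \times 480}{8} = 38\,400 \text{ bytes} \\
\text{16 Colors (2 pixels per byte):}      \quad & \frac{640 \times 480}{2} = 153\,600 \text{ bytes} \\
\text{64 Colors (1 pixel per byte):}       \quad & \frac{640 \times 480}{1} = 307\,200 \text{ bytes}
\end{align*}
```

The palette choice therefore changes the memory footprint by a factor of eight between the smallest and largest case.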

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt and right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

                                    16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt and right shifting the contents by one:

ADDR_PIXEL = '0' & hvcnt[18..1]

                                    64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
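
The three addressing rules above can be sketched as a single piece of combinational VHDL. This is only an illustration: the entity name pixel_addr, the 19-bit output width, and the CMOD encoding ("00" for Black & White, "01" for 16 Colors, anything else for 64 Colors) are assumptions made for the example and are not taken from the controller's documentation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity pixel_addr is
  port (
    cmod       : in  std_logic_vector(1 downto 0);   -- palette select (encoding assumed)
    hvcnt      : in  std_logic_vector(18 downto 0);  -- pixel counter from the Synchronization Unit
    addr_pixel : out std_logic_vector(18 downto 0)   -- byte address into picture memory
  );
end entity pixel_addr;

architecture rtl of pixel_addr is
begin
  -- A logical right shift drops the low-order bits and pads with zeros on the left.
  with cmod select
    addr_pixel <=
      std_logic_vector(shift_right(unsigned(hvcnt), 3)) when "00",   -- 8 pixels per byte
      std_logic_vector(shift_right(unsigned(hvcnt), 1)) when "01",   -- 2 pixels per byte
      hvcnt                                             when others; -- 1 pixel per byte
end architecture rtl;
```

The bits that are shifted out (hvcnt[2..0] or hvcnt[0]) select the pixel within the addressed byte, which is why they do not take part in the address.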

                                    The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).


The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
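
A minimal VHDL sketch of the enex condition is shown below. The entity name, the port ranges, and the generic defaults (which correspond to common 640x480 timing totals) are assumptions for illustration only. The real controller derives these limits from its own registers.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity read_enable is
  generic (
    PIX_TOTAL    : natural := 640;  -- viewable pixels per line (example value)
    LIN_TOTAL    : natural := 480;  -- viewable lines per frame (example value)
    LINE_PERIOD  : natural := 800;  -- total pixel clocks per line (example value)
    FRAME_PERIOD : natural := 525   -- total lines per frame (example value)
  );
  port (
    hcount : in  natural range 0 to 2047;
    vcount : in  natural range 0 to 2047;
    enex   : out std_logic;
    rd     : out std_logic
  );
end entity read_enable;

architecture rtl of read_enable is
  signal enex_i : std_logic;
begin
  -- Low while the beam is outside the viewable area, High otherwise.
  enex_i <= '0' when (hcount > PIX_TOTAL + 1 and hcount < LINE_PERIOD)
                  or (vcount > LIN_TOTAL + 1 and vcount < FRAME_PERIOD)
            else '1';

  enex <= enex_i;
  rd   <= enex_i;  -- read access is disabled whenever enex is Low
end architecture rtl;
```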

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

                                    16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode, each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.


                                    64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
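
The load and shift behavior of PIXREG in the three modes can be sketched as one clocked process. This is a simplified model under stated assumptions: the entity name, the CMOD encoding, and the asynchronous active-High reset are hypothetical, and the gating of reads by RD and enex is left out for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pixreg_unit is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                      -- assumed active High, asynchronous
    cmod  : in  std_logic_vector(1 downto 0);   -- assumed: "00" B&W, "01" 16 Colors, else 64 Colors
    hvcnt : in  std_logic_vector(18 downto 0);
    data  : in  std_logic_vector(7 downto 0);   -- byte read from picture memory
    pixel : out std_logic_vector(5 downto 0)    -- active pixel, right-aligned
  );
end entity pixreg_unit;

architecture rtl of pixreg_unit is
  signal pixreg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      pixreg <= (others => '0');
    elsif rising_edge(clk) then
      case cmod is
        when "00" =>                              -- Black & White: 8 pixels per byte
          if hvcnt(2 downto 0) = "000" then
            pixreg <= data;                       -- load the next byte
          else
            pixreg <= '0' & pixreg(7 downto 1);   -- shift right by one bit
          end if;
        when "01" =>                              -- 16 Colors: 2 pixels per byte
          if hvcnt(0) = '0' then
            pixreg <= data;
          else
            pixreg <= "0000" & pixreg(7 downto 4); -- shift right by four bits
          end if;
        when others =>                            -- 64 Colors: 1 pixel per byte
          pixreg <= data;                         -- load on every rising edge
      end case;
    end if;
  end process;

  -- The active pixel always sits in the low-order bits of the register.
  with cmod select
    pixel <= "00000" & pixreg(0)          when "00",
             "00"    & pixreg(3 downto 0) when "01",
             pixreg(5 downto 0)           when others;
end architecture rtl;
```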

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).
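
The reset and blanking behavior of RGBREG might be modeled as follows. The entity and port names are hypothetical, the reset is assumed to be asynchronous and active High, and the palette look-up that produces rgb_in for the Black & White and 16 Colors modes is deliberately left outside the sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgbreg_unit is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                     -- assumed active High, asynchronous
    en     : in  std_logic;                     -- line display enable
    rgb_in : in  std_logic_vector(5 downto 0);  -- mapped (or direct) RGB code of the active pixel
    rgbreg : out std_logic_vector(5 downto 0)
  );
end entity rgbreg_unit;

architecture rtl of rgbreg_unit is
  constant BLACK : std_logic_vector(5 downto 0) := "000000";
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= BLACK;          -- reset value represents black
    elsif rising_edge(clk) then
      if en = '1' then
        rgbreg <= rgb_in;       -- visible area: show the pixel color
      else
        rgbreg <= BLACK;        -- blanking interval
      end if;
    end if;
  end process;
end architecture rtl;
```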

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion


For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 4 x 4 x 4 = 64 different color permutations.

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
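
As an illustration of the synchronous, single-port use described above, the following is a minimal behavioral sketch of a 16K x 1 memory with clock enable and write enable. The entity name and the choice of the 16K x 1 aspect ratio are assumptions for the example. Whether a given synthesis tool maps this description onto a block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16K locations
    din  : in  std_logic_vector(0 downto 0);
    dout : out std_logic_vector(0 downto 0)
  );
end entity sp_ram_16kx1;

architecture rtl of sp_ram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic_vector(0 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the old contents appear on a write (read-first).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both reads and writes happen only on a rising clock edge with en High, matching the statement that read operations are synchronous and require a clock edge and clock enable.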


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                    Interoperability

                                    Technology independence

                                    Design reuse

                                    Several levels of abstraction

                                    Readability

                                    Standard language

                                    Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                    Design specification language

                                    Design entry language

                                    Design simulation language

                                    Design documentation language

                                    An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers

Funded by the US Department of Defense


The first publicly available version was released in 1985

In 1986, IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL

In 1987, standardization => IEEE 1076-1987

An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard

The IEEE 1164 standard was developed by the IEEE

IEEE 1164 contains definitions for a nine-valued data type, std_logic

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware

Defines e.g. two numeric types: signed and unsigned
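
A short sketch of how these standards appear in source code: std_logic and std_logic_vector come from the IEEE 1164 package, and signed and unsigned come from the IEEE 1076.3 package numeric_std. The entity name and ports are hypothetical and exist only to show the types in use.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use ieee.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

entity types_demo is
  port (
    a, b          : in  std_logic_vector(7 downto 0);
    sum_u         : out std_logic_vector(7 downto 0);
    a_lt_b_signed : out std_logic;
    bus_z         : out std_logic
  );
end entity types_demo;

architecture rtl of types_demo is
begin
  -- Unsigned addition; the 8-bit result wraps modulo 256.
  sum_u <= std_logic_vector(unsigned(a) + unsigned(b));

  -- The same bits, interpreted as two's complement numbers.
  a_lt_b_signed <= '1' when signed(a) < signed(b) else '0';

  -- 'Z' (high impedance) is one of the nine std_logic values.
  bus_z <= 'Z';
end architecture rtl;
```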

                                    VHDL Environment

                                    Design Units


                                    Segments of VHDL code that can be compiled separately and stored in a

                                    library

                                    Entities

                                    A black box with interface definition

Defines the inputs/outputs of a component (defines pins)

                                    A way to represent modularity in VHDL

                                    Similar to symbol in schematic

                                    Entity declaration describes entity

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration


Port directions

In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional. Assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures

Every entity has at least one architecture

One entity can have several architectures

Architectures can describe design using
- Behavior
- Structure
- Dataflow

Architectures can describe design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together

Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis


Some entities can have, e.g., a gate level architecture and a behavioral architecture

Configuration declarations are always optional
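
To make the link between entity, architecture and configuration concrete, here is a small sketch built on the Comparator example. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Behavioral-style architecture with the same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: binds architecture Comparator1 to the entity.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end configuration Comparator_cfg;
```

Changing the architecture name inside the configuration to Comparator2 would select the other description without touching the entity or either architecture.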

Packages

Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of declaration part and optional body part

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations


Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
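
A minimal sketch of a package declaration with a matching package body is given below. It shows a constant, a subtype, a deferred constant and a function. The package name vga_types_pkg and all of its contents are hypothetical and are not part of the controller code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_types_pkg is
  constant H_VISIBLE : natural := 640;           -- ordinary constant
  subtype rgb_t is std_logic_vector(5 downto 0); -- subtype declaration
  constant BLACK : rgb_t;                        -- deferred constant (value given in the body)
  function is_black(c : rgb_t) return boolean;   -- function declaration
end package vga_types_pkg;

package body vga_types_pkg is
  constant BLACK : rgb_t := "000000";            -- completes the deferred constant

  function is_black(c : rgb_t) return boolean is
  begin
    return c = BLACK;
  end function is_black;
end package body vga_types_pkg;
```

A design unit would make these names visible with "use work.vga_types_pkg.all;", assuming the package is compiled into the library work.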

Libraries

Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system

Can also be any other kind of database

                                    Levels of Abstraction

                                    VHDL supports many possible styles of design description which differ

                                    primarily in how closely they relate to the HW

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description


Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in the system

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
- Either a lower level description
- or a behavioral description of sequential elements is needed

The lower level register descriptions must be created or obtained

If there are no 3rd party models for registers => you must write the behavioral description of registers

The behavioral description can be provided in the form of subprograms (functions or procedures)
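
Since the language has no built-in register, one common way to obtain one is a short behavioral description using a clocked process, which is a different route from the subprogram form mentioned above. The sketch below assumes a synchronous, active-High reset and a clock enable. The entity name reg_n and its generic are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_n is
  generic (WIDTH : positive := 8);
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- synchronous reset, active High (assumed)
    en  : in  std_logic;  -- load enable
    d   : in  std_logic_vector(WIDTH - 1 downto 0);
    q   : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity reg_n;

architecture behavioral of reg_n is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        q <= (others => '0');
      elsif en = '1' then
        q <= d;           -- capture the input on the clock edge
      end if;             -- otherwise q keeps its value
    end if;
  end process;
end architecture behavioral;
```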

10.4 Behavioral VHDL Description

                                    Circuit is described in terms of its operation over time


Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, the order of sequential operations is defined

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process


Process keyword in VHDL

Process statement is a concurrent statement

Statements inside process statements are sequential statements

Process must contain either a sensitivity list or wait statement(s), but NOT both

Sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
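
The two forms of the general format can be compared side by side. In the sketch below, both processes describe the same AND function: the first uses a sensitivity list, the second an explicit wait statement at the end of the process. The entity and label names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate;

architecture behavioral of and_gate is
begin
  -- Form 1: sensitivity list. The process wakes up on any event on a or b.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement instead.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process with_wait;
end architecture behavioral;
```

Both processes run once at the start of simulation and then again each time a or b changes, so y1 and y2 always carry the same value.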


                                    11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    -- Declared as out (not inout) to match the sync_unit entity.
    line_count : out integer;
    c_hs       : out integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide clk50m by two to obtain the pixel clock clk.
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

                                    SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;
begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if(rstn = '0')then

        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s=793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s +1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s=515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then

        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <=95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >=143) and (c_hs_s<=778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then

        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s>=35 and line_count_s<=515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front porch, Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s <2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- One-clock delayed copy of v_en_s, used to detect the start of the visible region.
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg<=v_en_s;
    end if;
end process;

end arch_sync_unit;
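
The counter limits in the synchronization block fix the display timing. As a rough check, and assuming that clk50m really is 50 MHz so that the divided clk runs at 25 MHz (the frequency is inferred from the signal name and is not stated in the code), the line rate, frame rate and horizontal sync width work out as follows:

```latex
\begin{align*}
N_h &= 793 + 1 = 794 \text{ clocks per line} \\
N_v &= 515 + 1 = 516 \text{ lines per frame} \\
f_{\text{line}} &= \frac{25\,\text{MHz}}{794} \approx 31.49\,\text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\,\text{Hz} \\
t_{\text{hsync}} &= \frac{96}{25\,\text{MHz}} = 3.84\,\mu\text{s}
\end{align*}
```

These values are close to, but not identical with, the nominal 640 x 480 at 60 Hz timing of 800 clocks per line and 525 lines per frame at a 25.175 MHz pixel clock. The vertical sync pulse is low for line counts 0 and 1, that is, for two lines.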

                                    PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
signal addr_s : std_logic_vector(17 downto 0);
begin

addr <= addr_s;

process(clk, rstn)
begin

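    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- Rising edge of v_en: restart the memory address for a new frame.
        if((v_en='1') and (v_en_reg='0'))then
            addr_s<=(others =>'0');
        end if;
        if(v_en='1' and h_en='1')then
            if(line_count<=514)then
                if(c_hs <= 655)then
                    -- Inside the picture: read memory and show the pixel on all three channels.
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <=addr_s +"000000000000000001";
                else
                    -- Right of the picture: fixed green.
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                -- Below the picture: fixed white.
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";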

            end if;
            if(addr_s="111011111111111111")then
                addr_s<= (others =>'0');
            end if;
        else
            -- Blanking interval: drive black.
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
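
The address handling in the pixel generator can be checked with a little arithmetic. The figures below use only the constants that appear in the code (the 18-bit addr, the wrap value 111011111111111111, and the limits c_hs <= 655 and line_count <= 514). The image size they point to is an inference from those constants, not something the source states.

```latex
\begin{align*}
2^{18} &= 262144 && \text{locations addressable with an 18-bit address} \\
640 \times 480 &= 307200 > 2^{18} && \text{a full frame at 1 bit per pixel does not fit} \\
111011111111111111_2 &= 2^{18} - 2^{14} - 1 = 245759 && \text{last address before the wrap to zero} \\
245759 + 1 &= 245760 = 512 \times 480
\end{align*}
```

This suggests that the stored picture is 512 pixels wide and 480 lines high, drawn at the left of the visible area, with the remaining columns of each line filled green by the else branch.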

                                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985, the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company attributes the growing number of FPGA design starts to
them (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.
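
The sum-of-products structure mentioned above can be sketched in VHDL. The entity below is purely illustrative (the name macrocell_sketch, the ports and the particular product terms are assumptions, not taken from any real CPLD): two AND terms are OR-ed together and the result feeds one clocked register.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Illustrative sketch of a sum-of-products term feeding a register.
entity macrocell_sketch is
    port (
        clk        : in  std_logic;
        a, b, c, d : in  std_logic;
        q          : out std_logic
    );
end entity macrocell_sketch;

architecture rtl of macrocell_sketch is
    signal sop : std_logic;
begin
    -- Two product terms (AND) summed (OR).
    sop <= (a and b) or (c and not d);

    -- Clocked register on the sum-of-products output.
    process (clk)
    begin
        if rising_edge(clk) then
            q <= sop;
        end if;
    end process;
end architecture rtl;
```

Because every input reaches the register through the same AND-OR plane, the delay is nearly the same for any function that fits, which is the predictable timing the paragraph refers to.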

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                    Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

                                    The example FPGA considered here is an array, or island-style, FPGA. It
                                    consists of an array of logic blocks and routing channels. Two I/O pads fit
                                    into the height of one row or the width of one column, as shown below. All
                                    the routing channels have the same width (number of wires).

                                    12.1 FPGA structure

                                    A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                                    flip-flop, as shown below. In recent years, manufacturers have started
                                    moving to 6-input LUTs in their high-performance parts, claiming increased
                                    performance.

                                    Typical logic block

                                    There is only one output, which can be either the registered or the
                                    unregistered LUT output. The logic block has four inputs for the LUT and a
                                    clock input. Since clock signals (and often other high-fanout signals) are
                                    normally routed via special-purpose dedicated routing networks in
                                    commercial FPGAs, they are managed separately from other signals.
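The logic block described above can be sketched as a small behavioral model. This is only an illustration, not vendor code: the class name `LogicBlock`, the 16-bit `truth_table` encoding, and the input ordering (input `a` as the least significant index bit) are assumptions made for the example.

```python
class LogicBlock:
    """Behavioral sketch of a 4-input LUT followed by a D flip-flop.

    truth_table is a 16-bit integer; bit i holds the LUT output for the
    input combination whose index is i (assumed encoding).
    """

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth_table must fit in 16 bits")
        self.truth_table = truth_table
        self.registered = registered  # selects registered or unregistered output
        self.q = 0                    # flip-flop state

    def lut(self, a, b, c, d):
        """Combinational LUT output for the four single-bit inputs."""
        index = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | ((d & 1) << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output: flip-flop state or raw LUT output."""
        return self.q if self.registered else self.lut(a, b, c, d)


# A 4-input AND: only index 15 (all inputs high) produces 1.
and4 = LogicBlock(0x8000)
and4_registered = LogicBlock(0x8000, registered=True)
```

With `registered=True`, the output changes only after `clock(...)` is called, which mirrors the choice between the registered and unregistered LUT output.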

                                    For this example architecture, the locations of the FPGA logic block pins
                                    are shown below.

                                    Logic Block Pin Locations

                                    Each input is accessible from one side of the logic block, while the
                                    output pin can connect to routing wires in both the channel to the right
                                    of and the channel below the logic block.

                                    Each logic block output pin can connect to any of the wiring segments in
                                    the channels adjacent to it.

                                    Similarly, an I/O pad can connect to any one of the wiring segments in the
                                    channel adjacent to it. For example, an I/O pad at the top of the chip can
                                    connect to any of the W wires (where W is the channel width) in the
                                    horizontal channel immediately below it.

                                    Generally, the FPGA routing is unsegmented. That is, each wiring segment
                                    spans only one logic block before it terminates in a switch box. By
                                    turning on some of the programmable switches within a switch box, longer
                                    paths can be constructed. For higher-speed interconnect, some FPGA
                                    architectures use longer routing lines that span multiple logic blocks.

                                    Whenever a vertical and a horizontal channel intersect, there is a switch
                                    box. In this architecture, when a wire enters a switch box, there are
                                    three programmable switches that allow it to connect to three other wires
                                    in adjacent channel segments. The pattern, or topology, of switches used
                                    in this architecture is the planar or domain-based switch box topology. In
                                    this switch box topology, a wire in track number 1 connects only to wires
                                    in track number 1 in adjacent channel segments, wires in track number 2
                                    connect only to other wires in track number 2, and so on. The figure below
                                    illustrates the connections in a switch box.

                                    Switch box topology
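The planar (domain-based) connection rule can be illustrated with a short sketch. The side names and the functions `planar_connections` and `planar_switch_count` are hypothetical, introduced only for this example. The switch count assumes one programmable switch per pair of sides on each track.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Wires reachable from the wire entering on `side` in `track`.

    In a planar switch box a wire may only connect to the wires with the
    same track number on the three other sides.
    """
    if side not in SIDES:
        raise ValueError("unknown side: %r" % (side,))
    if not 0 <= track < channel_width:
        raise ValueError("track outside the channel")
    return [(other, track) for other in SIDES if other != side]


def planar_switch_count(channel_width):
    """Programmable switches in one switch box.

    Each track has four wire ends with three switches each, and every
    switch is shared by two wire ends: 4 * 3 / 2 = 6 per track.
    """
    return 6 * channel_width
```

For example, a wire entering from the west in track 1 can reach track 1 on the north, east, and south sides, and no other track.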

                                    Modern FPGA families expand upon the above capabilities to include
                                    higher-level functionality fixed into the silicon. Having these common
                                    functions embedded into the silicon reduces the area required and gives
                                    those functions increased speed compared to building them from primitives.
                                    Examples of these include multipliers, generic DSP blocks, embedded
                                    processors, high-speed I/O logic, and embedded memories.

                                    FPGAs are also widely used for systems validation, including pre-silicon
                                    validation, post-silicon validation, and firmware development. This allows
                                    chip companies to validate their design before the chip is produced in the
                                    factory, reducing the time to market.

                                    12.4 FPGA Design and Programming

                                    To define the behavior of the FPGA, the user provides a hardware
                                    description language (HDL) or a schematic design. The HDL form might be
                                    easier to work with when handling large structures, because it is possible
                                    to specify them numerically rather than having to draw every piece by
                                    hand. On the other hand, schematic entry can allow for easier
                                    visualisation of a design.

                                    Then, using an electronic design automation tool, a technology-mapped
                                    netlist is generated. The netlist can then be fitted to the actual FPGA
                                    architecture using a process called place-and-route, usually performed by
                                    the FPGA company's proprietary place-and-route software. The user will
                                    validate the map, place, and route results via timing analysis,
                                    simulation, and other verification methodologies. Once the design and
                                    validation process is complete, the binary file generated (also using the
                                    FPGA company's proprietary software) is used to (re)configure the FPGA.

                                    Going from schematic/HDL source files to actual configuration: the source
                                    files are fed to a software suite from the FPGA/CPLD vendor that, through
                                    different steps, will produce a file. This file is then transferred to the
                                    FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                                    such as an EEPROM.

                                    The most common HDLs are VHDL and Verilog, although, in an attempt to
                                    reduce the complexity of designing in HDLs, which have been compared to
                                    the equivalent of assembly languages, there are moves to raise the
                                    abstraction level through the introduction of alternative languages.

                                    To simplify the design of complex systems in FPGAs, there exist libraries
                                    of predefined complex functions and circuits that have been tested and
                                    optimized to speed up the design process. These predefined circuits are
                                    commonly called IP cores and are available from FPGA vendors and
                                    third-party IP suppliers (rarely free, and typically released under
                                    proprietary licenses). Other predefined circuits are available from
                                    developer communities such as OpenCores (typically free, and released
                                    under the GPL, BSD, or a similar license) and other sources.

                                    In a typical design flow, an FPGA application developer will simulate the
                                    design at multiple stages throughout the design process. Initially, the
                                    RTL description in VHDL or Verilog is simulated by creating test benches
                                    to simulate the system and observe results. Then, after the synthesis
                                    engine has mapped the design to a netlist, the netlist is translated to a
                                    gate-level description, where simulation is repeated to confirm that the
                                    synthesis proceeded without errors. Finally, the design is laid out in the
                                    FPGA, at which point propagation delays can be added and the simulation
                                    run again with these values back-annotated onto the netlist.

                                    Basic Process Technology Types

                                    SRAM - Based on static memory technology. In-system programmable and
                                    re-programmable. Requires external boot devices. CMOS.

                                    Antifuse - One-time programmable. CMOS.

                                    EPROM - Erasable Programmable Read-Only Memory technology. Usually
                                    one-time programmable in production because of plastic packaging.
                                    Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                    EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
                                    Can be erased, even in plastic packages. Some, but not all, EEPROM devices
                                    can be in-system programmed. CMOS.

                                    Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                                    packages. Some, but not all, flash devices can be in-system programmed.
                                    Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                                    therefore less expensive to manufacture. CMOS.

                                    Fuse - One-time programmable. Bipolar.

                                    Major Manufacturers

                                    Xilinx and Altera are the current FPGA market leaders and long-time
                                    industry rivals. Together, they control over 80 percent of the market,
                                    with Xilinx alone representing over 50 percent.

                                    Xilinx also provides free Windows and Linux design software, while Altera
                                    provides free Windows tools; the Solaris and Linux tools are only
                                    available via a rental scheme.

                                    Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                                    (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
                                    power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic
                                    (handheld-focused CSSP, no general-purpose FPGAs).

                                    12.5 FPGA prototype

                                    FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                                    prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                                    for hardware verification and early software development.

                                    Prototyping SoC and ASIC designs on FPGAs has become a mainstream
                                    verification method for hardware design and for early software and
                                    firmware co-design.

                                    Reasons why Prototyping is important

                                    1. Running a SoC design on an FPGA prototype is a reliable way to ensure
                                    that it is functionally correct, compared with designers relying only on
                                    software simulations to verify that their hardware design is sound.
                                    Simulation speed and modeling accuracy limitations hinder simulation-only
                                    development.

                                    2. Due to time constraints, many projects cannot wait until the silicon is
                                    back from the foundry to start on software tests. FPGA prototyping allows
                                    much more time for software development and testing at the
                                    software-hardware integration stage. This allows many unforeseen software
                                    bugs, which appear due to today's array of operating systems,
                                    applications, and hardware, to be found earlier.

                                    3. Prototyping also allows the developer to ensure that all IP
                                    technologies on the system work well together, off the simulation stage
                                    and in actual form.

                                    4. Prototypes have the added advantage of serving as demo platforms for
                                    SoC clients, bringing in interest early. This speeds up the overall
                                    development cycle and allows for more enhancement or improvement of the
                                    chip features than would otherwise have been possible.

                                    13 SIMULATION RESULTS

                                    Fig. Gate level: VGA Controller

                                    Fig. Gate level: Sync Unit

                                    Fig. Technology schematic

                                    14 APPLICATIONS

                                    Motion Tracking Systems

                                    Image processing systems

                                    Displaying Systems

                                    14.1 CONCLUSION

                                    Hence, by implementing the VGA Controller on an FPGA, a processor-generated
                                    (pixelated) picture is taken from memory space, and the Controller
                                    provides digital RGB values for each pixel, as well as horizontal and
                                    vertical synchronization signals, in order to correctly display the
                                    picture on a connected monitor.


                                      the internal line enable signal en generated by the Synchronization Unit of the Controller

                                      DATA  I  8  Data input from picture memory space. Data is stored in memory in bytes, the content of which depends on the chosen color palette, as follows:
                                        Black & White mode: 8 1-bit pixels
                                        16 Colors mode: 2 4-bit pixels
                                        64 Colors mode: 1 6-bit pixel (DATA[5..0])
                                        Note: In 64 Colors mode, bits 7 and 6 of each byte are not used.

                                      ADDR_PIXEL  O  19  Specifies the address of the next pixel in picture memory. Addresses are consecutive – once the end of the current line has been reached, the next address is that of the pixel at the start of the next line down in the frame.

                                      VGA Monitor Control Signals

                                      HSYNC  O  Falling  Horizontal synchronization signal. This signal is used to control the horizontal deflection circuit in the VGA monitor, so that the start and end of a line of pixels is correctly displayed across the visible display area of the screen. The horizontal size of the display area is controlled by the DISPSIZE_H input to the Controller.

                                      VSYNC  O  Falling  Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

                                      R1  O  High Low  Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                      G1  O  High Low  Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                      B1  O  High Low  Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                      9 VGA SIGNAL TIMING

                                      Figure 3 summarizes the signal timing involved in sending a line of
                                      pixels and a frame of lines. The actual time values differ according to
                                      the resolution selected (640x480 or 800x600), the processor-defined
                                      values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external
                                      clock signal (CLK).

                                      9.1 VGA PIXEL UNIT

                                      The Pixel Unit provides access to the pixelated image stored in external
                                      picture memory, reading in data a byte at a time and formatting each
                                      pixel to be displayed. For each pixel, the 6-bit RGB value required for
                                      the control of the monitor's three primary color electron guns is
                                      generated, so that the pixel is displayed on the screen with the correct
                                      color.

                                      Accessing the stored image

                                      The image to be displayed on the monitor screen is written by the host
                                      microcontroller into external memory space (RAM). This memory space can
                                      be located anywhere (e.g. using a block of RAM within the FPGA design or
                                      using the SRAM on the NanoBoard itself).

                                      Pixel data is stored in the memory space in bytes. The number of pixels
                                      in a byte depends on which color palette is being used for the image
                                      (selected by the CMOD input):

                                      • in Black & White mode – one byte of data in RAM contains 8 1-bit pixels

                                      • in 16 Colors mode – one byte of data in RAM contains 2 4-bit pixels

                                      • in 64 Colors mode – one byte of data in RAM contains 1 6-bit pixel
                                      (DATA[7..6] are not used)

                                      The size of memory required to store a picture is determined by the
                                      total number of viewable pixels in a line (determined by DISPSIZE_H),
                                      the total number of viewable lines in a frame (determined by
                                      DISPSIZE_V), and the number of pixels stored in each byte in memory
                                      space:

                                      Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
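As a worked example of the memory formula, the sketch below evaluates it for the three color palettes. The mode names ("bw", "16", "64") and the function `picture_memory_bytes` are labels chosen for this illustration; they are not the actual CMOD encodings, which the text does not give.

```python
# Pixels stored per byte in each color palette, as listed in the text.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed for one picture: (PixTotal x LinTotal) / pixels per byte.

    The division is rounded up so that a partially filled last byte is
    still counted (an assumption; the text does not discuss this case).
    """
    pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    return -(-pixels // per_byte)  # ceiling division


vga_bw = picture_memory_bytes(640, 480, "bw")    # 38400 bytes
vga_16 = picture_memory_bytes(640, 480, "16")    # 153600 bytes
vga_64 = picture_memory_bytes(640, 480, "64")    # 307200 bytes
svga_64 = picture_memory_bytes(800, 600, "64")   # 480000 bytes
```

The largest case, 800x600 in 64 Colors mode, needs 480000 bytes, which fits within the 2^19 = 524288 locations addressable by the 19-bit ADDR_PIXEL output.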

                                      The address in RAM where the next pixel is stored is determined using
                                      an internal signal provided by the Synchronization Unit – hvcnt – which
                                      reflects the current contents of the MEMADDR register. The exact
                                      addressing is described below.

                                      9.2 Black & White mode

                                      The picture memory address – the byte of data containing the next 8
                                      pixels – is determined by using bits 18..3 of hvcnt, that is, by right
                                      shifting the contents by three:

                                      ADDR_PIXEL = 00 & hvcnt[18..3]

                                      16 Colors mode

                                      The picture memory address – the byte of data containing the next 2
                                      pixels – is determined by using bits 18..1 of hvcnt, that is, by right
                                      shifting the contents by one:

                                      ADDR_PIXEL = 0 & hvcnt[18..1]

                                      64 Colors mode

                                      The picture memory address – the byte of data containing the next pixel –
                                      is determined by using the full value of hvcnt:

                                      ADDR_PIXEL = hvcnt[18..0]
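The three addressing rules amount to discarding the low-order bits of hvcnt that select a pixel within a byte. The sketch below shows this with integer shifts. It assumes hvcnt is a 19-bit pixel counter and uses hypothetical mode names ("bw", "16", "64"); zero padding of the upper address bits happens implicitly because the shifted value is smaller.

```python
HVCNT_MASK = (1 << 19) - 1  # hvcnt is treated as a 19-bit value


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= HVCNT_MASK
    if mode == "bw":    # 8 pixels per byte: drop bits 2..0
        return hvcnt >> 3
    if mode == "16":    # 2 pixels per byte: drop bit 0
        return hvcnt >> 1
    if mode == "64":    # 1 pixel per byte: use hvcnt directly
        return hvcnt
    raise ValueError("unknown mode: %r" % (mode,))
```

For pixel number 17, this gives byte 2 in Black & White mode, byte 8 in 16 Colors mode, and byte 17 in 64 Colors mode.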

                                      The Pixel register (PIXREG)

                                      The Pixel register is used to receive the byte of pixel data read from
                                      the current address in memory space. The register PIXREG is reset to
                                      zero when the VGA Controller receives an external reset signal (RST).

                                      The register is updated on each rising edge of the CLK signal. Data can
                                      be read from the memory space as long as the RD signal is active (High).
                                      The RD signal is itself controlled by the external line display enable
                                      signal, enex. This internally generated signal is defined as follows:

                                      If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT >
                                      (LinTotal+1)) and (VCOUNT < Frame Period)) then
                                          enex = 0
                                      Else
                                          enex = 1

                                      When enex is Low, read access from memory is disabled (RD = 0).
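The enex condition can be transcribed directly into a small function. This is a sketch of the rule as stated, not the Controller's source. The example values of 800 clocks per line and 525 lines per frame are the standard 640x480 VGA totals and are used here only as illustrative arguments.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """External line display enable: 0 while blanking, 1 otherwise."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Memory read enable follows enex: reads are disabled when enex is Low."""
    return enex(hcount, vcount, pix_total, lin_total, line_period, frame_period)


visible = enex(100, 100, 640, 480, 800, 525)          # inside the picture
horizontal_blank = enex(700, 100, 640, 480, 800, 525)  # past the end of a line
vertical_blank = enex(100, 500, 640, 480, 800, 525)    # past the last line
```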

                                      The point at which data is loaded from memory into PIXREG depends on the
                                      particular color palette that is chosen – Black & White, 16 Colors, or
                                      64 Colors.

                                      Black & White mode

                                      The next byte of data will be loaded into the register whenever the
                                      lowest three bits of the hvcnt signal – received from the
                                      Synchronization Unit – are "000".

                                      For the currently loaded byte, the active pixel is always in the lowest
                                      bit position of the Pixel register. Each pixel in the data byte is moved
                                      into this active pixel position by shifting the contents of the register
                                      right by one bit on each rising edge of CLK.

                                      16 Colors mode

                                      The next byte of data will be loaded into the register whenever the
                                      lowest bit of the hvcnt signal – received from the Synchronization
                                      Unit – is 0.

                                      For the currently loaded byte, the active pixel is always in the
                                      low-order nibble of the Pixel register. Remember that in this mode, each
                                      byte of data contains two pixels. The second pixel is moved into this
                                      active pixel position by shifting the contents of the register right by
                                      four bits on the rising edge of CLK.

                                      64 Colors mode

                                      The next byte of data will be loaded into the register on the rising
                                      edge of the external system clock signal (CLK). In this mode, the read
                                      of pixel data does not depend on the status of the hvcnt signal received
                                      from the Synchronization Unit.
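The load-and-shift behavior of PIXREG in the three modes can be modeled with one function per clock edge. This is a simplified behavioral sketch: the names `step_pixreg` and `unpack_byte` and the mode labels are hypothetical, reset and the RD/enex gating are left out, and the active pixel is read in the same step as the load, which ignores any pipeline delay in the real design.

```python
def step_pixreg(pixreg, data, hvcnt, mode):
    """Model one rising CLK edge; return (new_pixreg, active_pixel)."""
    if mode == "bw":
        # Load when hvcnt[2..0] == "000", otherwise shift right by one bit.
        new = (data & 0xFF) if (hvcnt & 0b111) == 0 else (pixreg >> 1)
        return new, new & 0x1
    if mode == "16":
        # Load when hvcnt[0] == 0, otherwise shift right by four bits.
        new = (data & 0xFF) if (hvcnt & 0b1) == 0 else (pixreg >> 4)
        return new, new & 0xF
    if mode == "64":
        # Load on every clock; bits 7 and 6 are not used.
        new = data & 0xFF
        return new, new & 0x3F
    raise ValueError("unknown mode: %r" % (mode,))


def unpack_byte(data, mode):
    """Pixels of one memory byte in display order (lowest bits first)."""
    count = {"bw": 8, "16": 2, "64": 1}[mode]
    pixreg, pixels = 0, []
    for hvcnt in range(count):
        pixreg, pixel = step_pixreg(pixreg, data, hvcnt, mode)
        pixels.append(pixel)
    return pixels
```

Because the active pixel always sits in the lowest bits and the register shifts right, the least significant pixel of each byte is displayed first.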

                                      9.3 The RGB register (RGBREG)

                                      The RGB register is used to store the six bits that are required for
                                      driving the red, green, and blue color guns of the target monitor. When
                                      the chosen color palette is either Black & White or 16 Colors, these six
                                      bits are obtained by mapping the value of the active pixel to a
                                      predefined RGB code. When the chosen color palette is 64 Colors, the
                                      actual pixel value is used directly.

                                      The register RGBREG is reset to zero (000000) when the VGA Controller
                                      receives an external reset signal (RST). This RGB code represents black.

                                      The register is updated on each rising edge of the CLK signal, and the
                                      value loaded is dependent on the state of the line display enable
                                      signal, en. When en is Low, blanking is required and RGBREG is loaded
                                      with the code for black (000000).

                                      Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
                                      and a 4-bit pixel (16 Colors mode) into the required RGB color code.
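The RGBREG update rule can be sketched as follows. The actual palette mapping is not reproduced in this text, so the values here are assumptions: the Black & White palette is taken to be black (000000) for 0 and white (111111) for 1, and the 16-color palette is passed in by the caller as a hypothetical 16-entry list of 6-bit codes.

```python
# Assumed 1-bit palette: 0 -> black, 1 -> white (not taken from the source table).
BW_PALETTE = {0: 0b000000, 1: 0b111111}


def next_rgbreg(active_pixel, en, mode, palette16=None):
    """Value loaded into RGBREG on a rising CLK edge."""
    if not en:
        return 0b000000  # blanking: load the code for black
    if mode == "bw":
        return BW_PALETTE[active_pixel & 0x1]
    if mode == "16":
        if palette16 is None or len(palette16) != 16:
            raise ValueError("16 Colors mode needs a 16-entry palette")
        return palette16[active_pixel & 0xF] & 0x3F
    if mode == "64":
        return active_pixel & 0x3F  # the pixel value is used directly
    raise ValueError("unknown mode: %r" % (mode,))


# Placeholder palette for demonstration only; real codes come from the mapping table.
demo_palette16 = list(range(16))
```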

                                      The RGB color code stored in the RGB register is output from the VGA
                                      Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1,
                                      B0, and B1).

                                      The monitor itself expects analog signals as inputs to its electron gun
                                      control circuits. This is achieved by using 2-bit digital-to-analog
                                      converter circuitry located on the NanoBoard itself, as shown in
                                      Figure 4.

                                      Figure 9.1 Digital to Analog RGB conversion

                                      For each color, the 2-bit digital signal from the VGA Controller can be
                                      converted into 4 distinct analog levels. These levels specify the
                                      intensity of each of the three primary colors to use when displaying the
                                      pixel on the monitor's screen. The levels range from 0 V (total
                                      darkness) to 0.7 V (maximum brightness).

                                      With each analog input being one of four possible levels, the monitor
                                      can display each pixel on the screen with one of 64 different color
                                      permutations.
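The arithmetic behind the four levels and the 64 colors can be checked with a few lines. The sketch assumes an ideal DAC with evenly spaced levels between 0 V and 0.7 V; the text gives only the two end points, so the intermediate voltages are an assumption.

```python
V_MAX = 0.7  # volts at maximum brightness, from the text


def dac_level(code):
    """Ideal output voltage for a 2-bit code (0..3), assuming linear steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value")
    return V_MAX * code / 3


levels = [dac_level(code) for code in range(4)]  # four levels per color
color_count = len(levels) ** 3                   # three color channels
```

Four levels on each of the three channels give 4 x 4 x 4 = 64 combinations, which matches the 6-bit RGB code.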

                                      Using Block RAM in Spartan-3E Generation FPGAs

                                      For applications requiring large on-chip memories, Spartan™-3
                                      Generation FPGAs provide plentiful, efficient Select RAM™ memory blocks.
                                      Using various configuration options, Select RAM blocks create RAM, ROM,
                                      FIFOs, large look-up tables, data width converters, circular buffers,
                                      and shift registers, each supporting various data widths and depths.
                                      This application note describes the features and capabilities of block
                                      Select RAM and illustrates how to specify the various options using the
                                      Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM can also act as a single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations, or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
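
As an illustration of the synchronous, single-port use described above, the following is a minimal behavioral sketch of a 16K x 1 memory with clock enable and write enable. The entity name sp_ram_16kx1 and all port names are hypothetical. Whether a synthesis tool maps such a description onto a block RAM depends on the tool and its settings, so treat this as a sketch rather than a guaranteed block RAM template.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous RAM, 16K x 1 (16,384 data bits).
entity sp_ram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16,384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity sp_ram_16kx1;

architecture rtl of sp_ram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout only changes on a clock edge with en high.
        -- On a write, dout returns the value stored before the write.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```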


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                      Interoperability

                                      Technology independence

                                      Design reuse

                                      Several levels of abstraction

                                      Readability

                                      Standard language

                                      Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High Speed Integrated Circuit)

                                      Design specification language

                                      Design entry language

                                      Design simulation language

                                      Design documentation language

                                      An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.
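
A short sketch of how these two standards show up in source code: std_logic and std_logic_vector come from the IEEE 1164 package, and unsigned comes from the IEEE 1076.3 package numeric_std. The entity name adder8 and its port names are hypothetical, chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use ieee.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

-- Hypothetical example entity; names are illustrative only.
entity adder8 is
  port (
    a, b    : in  std_logic_vector(7 downto 0);
    sum     : out std_logic_vector(8 downto 0);  -- one extra bit for the carry
    bus_out : out std_logic
  );
end entity adder8;

architecture rtl of adder8 is
begin
  -- Interpret the bit vectors as unsigned numbers, widen to 9 bits, then add.
  sum <= std_logic_vector(resize(unsigned(a), 9) + resize(unsigned(b), 9));

  -- 'Z' (high impedance) is one of the nine std_logic values:
  -- 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-'
  bus_out <= 'Z';
end architecture rtl;
```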

                                      VHDL Environment

                                      Design Units


                                      Segments of VHDL code that can be compiled separately and stored in a

                                      library

                                      Entities

                                      A black box with interface definition

Defines the inputs/outputs of a component (defines the pins)

A way to represent modularity in VHDL

Similar to a symbol in a schematic

The entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction, and a type

                                      An entity may have NO port declaration


Port directions:

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
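
A minimal sketch of the port modes in use. The entity counter4 and its port names are hypothetical. It shows a common way to avoid a buffer port: the state is kept in an internal signal, so the out port is only ever assigned and never read.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter; names are illustrative only.
entity counter4 is
  port (
    clk   : in  std_logic;                    -- mode in: can only be read
    rst   : in  std_logic;
    count : out std_logic_vector(3 downto 0)  -- mode out: assigned, never read
  );
end entity counter4;

architecture rtl of counter4 is
  -- Internal signal holds the state, so the out port never has to be read.
  signal count_i : unsigned(3 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      count_i <= (others => '0');
    elsif rising_edge(clk) then
      count_i <= count_i + 1;  -- wraps from 15 back to 0
    end if;
  end process;

  count <= std_logic_vector(count_i);
end architecture rtl;
```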

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link an entity declaration and an architecture body together

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed is linked to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., both a gate-level architecture and a behavioral architecture

Configurations are always optional
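
The following sketch shows what a configuration declaration can look like. It assumes the Comparator entity and its Comparator1 architecture from the examples above have already been analyzed into library work. The test bench comparator_tb, its architecture test, the instance label dut, and the configuration name are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- A test bench is a typical entity with no port declaration.
entity comparator_tb is
end entity comparator_tb;

architecture test of comparator_tb is
  component Comparator
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end component;

  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : Comparator port map (A => a_s, B => b_s, EQ => eq_s);
end architecture test;

-- Bind the component instance to one specific architecture of the entity.
configuration comparator_tb_cfg of comparator_tb is
  for test
    for dut : Comparator
      use entity work.Comparator(Comparator1);
    end for;
  end for;
end configuration comparator_tb_cfg;
```

Selecting a different architecture (for example a gate-level one) only requires changing the name in parentheses and re-analyzing the configuration.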

Packages: Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not always needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units

A package consists of a declaration part and an optional body part

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations


A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
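
A minimal sketch of a package with both parts. The package name vga_pkg and everything declared in it are hypothetical; the 640 x 480 figures are the VGA resolution mentioned earlier in this report. The declaration carries the constants, a subtype, and the function declaration, while the body carries the function body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_pkg is
  constant H_VISIBLE : natural := 640;  -- visible pixels per line
  constant V_VISIBLE : natural := 480;  -- visible lines per frame

  subtype pixel_t is std_logic_vector(0 downto 0);

  function is_visible (col, row : natural) return boolean;
end package vga_pkg;

-- Package body: needed here only because the function needs a body.
package body vga_pkg is
  function is_visible (col, row : natural) return boolean is
  begin
    return col < H_VISIBLE and row < V_VISIBLE;
  end function is_visible;
end package body vga_pkg;
```

A design unit would make these names visible with "use work.vga_pkg.all;", assuming the package was analyzed into library work.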

Libraries: Collection of VHDL design units (a database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system

Can also be any other kind of database

                                      Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components

The description can range from low level (e.g. transistor level) to high level (e.g. block diagram)

For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description


The circuit is described in terms of how data moves through the system

In the dataflow style, you describe how information flows between registers in the system

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no third-party models for registers => you must write the behavioral description of the registers

The behavioral description can be provided in the form of subprograms (functions or procedures)
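
One way to provide such a register description as a subprogram is a procedure with signal parameters, called concurrently. The sketch below is illustrative only: the package reg_pkg, the procedure dff, and the entity two_stage are hypothetical names, and support for synthesizing concurrent procedure calls varies between tools.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package reg_pkg is
  -- Behavioral description of a D flip-flop as a procedure.
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic);
end package reg_pkg;

package body reg_pkg is
  procedure dff (signal clk : in  std_logic;
                 signal d   : in  std_logic;
                 signal q   : out std_logic) is
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end procedure dff;
end package body reg_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.reg_pkg.all;

-- Dataflow-style use: data moves din -> stage1 -> dout, one register per step.
entity two_stage is
  port (clk, din : in  std_logic;
        dout     : out std_logic);
end entity two_stage;

architecture dataflow of two_stage is
  signal stage1 : std_logic;
begin
  -- Each concurrent procedure call is re-evaluated whenever clk or its
  -- data input changes, and updates its output only on a rising clock edge.
  r1 : dff(clk, din, stage1);
  r2 : dff(clk, stage1, dout);
end architecture dataflow;
```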

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time

The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, only the order of sequential operations is defined

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis
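
A small simulation-only sketch of an explicit delay. The entity name delay_demo, the signal names, and the stimulus timing are hypothetical. The "after 10 ns" clause is honored by a simulator; as noted above, synthesis tools ignore such timing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test entity with no ports; intended for simulation only.
entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Explicit delay: every change on b appears on a 10 ns later.
  a <= b after 10 ns;

  -- Stimulus: raise b at time 0, lower it again at 25 ns.
  stimulus : process
  begin
    b <= '1';
    wait for 25 ns;
    b <= '0';
    wait;  -- suspend this process forever
  end process stimulus;
end architecture behavioral;
```

In simulation, a rises at 10 ns and falls at 35 ns, following b with the specified delay.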

                                      Concurrent Vs Sequential

                                      Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement

Statements inside a process statement are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) name the signals that wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
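
The two forms of the general format can be compared side by side. In this sketch the entity and_gate and its port names are hypothetical. Both processes describe the same AND function: a process with a sensitivity list behaves like a process with a "wait on" statement for the same signals as its last statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity; names are illustrative only.
entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate;

architecture rtl of and_gate is
begin
  -- Form 1: sensitivity list. The process wakes up when a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list; an explicit wait statement instead.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture rtl;
```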


                                      11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Note: STD_LOGIC_ARITH and numeric_std both declare the types signed and
-- unsigned; with both packages in use, those type names are not directly visible.
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
  port (
    addr  : in  std_logic_vector(17 downto 0);
    clk   : in  std_logic;
    dout  : out std_logic_vector(0 downto 0);
    en    : in  std_logic;
    sinit : in  std_logic);
end component;


component sync_unit
  port (
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  -- horizontal sync pulse (row)
    vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;


signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr, 18));

-- Divide the 50 MHz input clock by two: clk toggles on every rising edge of clk50m.
process (clk50m, rstn)
begin
  if (rstn = '0') then
    clk <= '0';
  elsif (clk50m = '1' and clk50m'event) then
    clk <= not clk;
  end if;
  --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map (
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  v_en_reg   => v_en_reg_s,
  line_count => line_count_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  c_hs       => c_hs_s
);


pixel : pixelgen port map (
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  v_en_reg   => v_en_reg_s,
  data_in    => data_s,
  line_count => line_count_s,
  c_hs       => c_hs_s,
  red        => red,
  green      => green,
  blue       => blue,
  addr       => addr_s,
  rd         => rd_s
);

mem12 : memory port map (
  addr  => addr_s,
  clk   => clk,
  dout  => data_s,
  en    => rd_s,
  sinit => rstn  -- rstn is active low; check the polarity the memory core expects on sinit
);

end Behavioral;


                                      SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter: counts clock cycles 0 to 793 (794 cycles per line)
process (clk, rstn)
begin
  if (rstn = '0') then
    c_hs_s <= 0;
  else
    if (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end if;
end process;

-- vsync counter
-- vertical line counter: advances on each rising edge of h_en_s
process (h_en_s, rstn)
begin
  if (rstn = '0') then
    line_count_s <= 0;
  elsif (h_en_s = '1' and h_en_s'event) then
    line_count_s <= line_count_s + 1;
    -- The later assignment wins: wrap back to 0 after line 515.
    if (line_count_s = 515) then
      line_count_s <= 0;
    end if;
  end if;
  --end if;
end process;

-- hsync pulse: low while the horizontal counter is in the range 0 to 95
process (clk, rstn)
begin
  if (rstn = '0') then
    hsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
  --end if;
end process;

-- Horizontal enable (row information will be sent in this period)
process (clk, rstn)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
   if (rstn = '0') then
      v_en_s <= '0';
   elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35) and (line_count_s <= 515)) then
         v_en_s <= '1';
      else
         v_en_s <= '0';
      end if;
   end if;
end process;

-- vsync pulse: active low during lines 0 and 1
process(clk, rstn)
begin
   if (rstn = '0') then
      vsync <= '1';
   elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
         vsync <= '0';
      else
         vsync <= '1';
      end if;
   end if;
end process;

-- v_en delayed by one clock, used downstream to detect the start of a frame
process(clk)
begin
   if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
   end if;
end process;

end arch_sync_unit;
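The comparison constants in the sync unit can be checked with a small software model. The Python sketch below is only an illustration, not part of the design: the function names are made up for this example, the one-clock delay of the registered outputs is ignored, and `H_TOTAL = 800` is an assumption taken from standard 640x480 timing, because the wrap value of the horizontal counter is not visible in this part of the listing.

```python
# Software model of the comparisons in the sync unit listing.
# H_TOTAL is an assumption (800 clocks per line, as in standard 640x480
# timing); the thresholds 95, 143, 778, 2, 35 and 515 come from the listing.
H_TOTAL = 800

def hsync(c_hs):
    """Active-low horizontal sync: low while the counter is 0..95."""
    return 0 if c_hs <= 95 else 1

def h_en(c_hs):
    """Horizontal enable: high while the counter is 143..778."""
    return 1 if 143 <= c_hs <= 778 else 0

def vsync(line_count):
    """Active-low vertical sync: low during lines 0 and 1."""
    return 0 if line_count < 2 else 1

def v_en(line_count):
    """Vertical enable: high during lines 35..515."""
    return 1 if 35 <= line_count <= 515 else 0

sync_clocks = sum(1 for c in range(H_TOTAL) if hsync(c) == 0)
enabled_clocks = sum(h_en(c) for c in range(H_TOTAL))
enabled_lines = sum(v_en(n) for n in range(516))  # line counter runs 0..515
print(sync_clocks, enabled_clocks, enabled_lines)  # 96 636 481
```

Under these assumptions the sync pulse lasts 96 clocks, and the enable windows cover 636 clocks per line and 481 lines per frame.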


                                      PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
   clk        : in  std_logic;
   rstn       : in  std_logic;
   hsync      : in  std_logic;
   vsync      : in  std_logic;
   h_en       : in  std_logic;
   v_en       : in  std_logic;
   v_en_reg   : in  std_logic;
   data_in    : in  std_logic_vector(0 downto 0);
   line_count : in  integer;
   c_hs       : in  integer;
   red        : out std_logic_vector(0 downto 0);
   green      : out std_logic_vector(0 downto 0);
   blue       : out std_logic_vector(0 downto 0);
   addr       : out std_logic_vector(17 downto 0);
   rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
   signal addr_s : std_logic_vector(17 downto 0);
begin
   addr <= addr_s;

   process(clk, rstn)
   begin
      if (rstn = '0') then
         rd     <= '0';
         red    <= "0";
         green  <= "0";
         blue   <= "0";
         addr_s <= x"0000" & "00";
      elsif (clk = '1' and clk'event) then
         -- rising edge of v_en (start of frame): restart the read address
         if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
         end if;
         if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
               if (c_hs <= 655) then
                  -- picture area: read memory and drive the 1-bit pixel on all channels
                  rd     <= '1';
                  red    <= data_in;
                  green  <= data_in;
                  blue   <= data_in;
                  addr_s <= addr_s + "000000000000000001";
               else
                  -- right of the picture area: green
                  rd    <= '0';
                  red   <= "0";
                  green <= "1";
                  blue  <= "0";
               end if;
            else
               -- line 515: white
               rd    <= '0';
               red   <= "1";
               green <= "1";
               blue  <= "1";
            end if;
            -- wrap the address after the last location
            if (addr_s = "111011111111111111") then
               addr_s <= (others => '0');
            end if;
         else
            -- outside the enabled region: black
            red   <= "0";
            green <= "0";
            blue  <= "0";
         end if;
      end if;
   end process;
end behv;
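The decision tree in the pixel generator can be restated as a short Python model, which makes the colour regions and the address wrap value easier to read. This is a sketch for illustration only: the function name `next_outputs` and the `rd_prev` argument are assumptions introduced here, and the address counter itself is not modelled.

```python
# Address at which the listing wraps addr_s back to zero.
LAST_ADDR = int("111011111111111111", 2)

def next_outputs(rd_prev, v_en, h_en, line_count, c_hs, data_in):
    """Return (rd, red, green, blue) after one rising clock edge."""
    if v_en == 1 and h_en == 1:
        if line_count <= 514:
            if c_hs <= 655:
                # Picture area: read memory, show the 1-bit pixel on all channels.
                return (1, data_in, data_in, data_in)
            # Right of the picture area: green.
            return (0, 0, 1, 0)
        # Line 515: white.
        return (0, 1, 1, 1)
    # Outside the enabled region: black; rd is not assigned, so it holds.
    return (rd_prev, 0, 0, 0)

print(LAST_ADDR + 1)                       # 245760
print(next_outputs(0, 1, 1, 100, 200, 1))  # (1, 1, 1, 1)
print(next_outputs(1, 1, 1, 100, 700, 1))  # (0, 0, 1, 0)
```

The wrap value corresponds to 245,760 memory locations, which equals 512 x 480. That would fit a 512-pixel-wide, 480-line one-bit image, although the listing does not state the image size.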


                                      12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option
of being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
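The throughput argument comes down to simple arithmetic: operations per second scale with the number of units working in parallel as well as with the clock rate. The Python sketch below illustrates this with made-up figures. The unit counts and clock rates are hypothetical and are not measurements of any real FPGA or processor.

```python
def throughput_ops_per_s(parallel_units, clock_hz, ops_per_unit_per_cycle=1):
    """Peak operations per second for units that all work on every clock."""
    return parallel_units * clock_hz * ops_per_unit_per_cycle

# Hypothetical figures, chosen only to illustrate the trade-off:
# many slow parallel units versus a few fast ones.
fpga_like = throughput_ops_per_s(parallel_units=200, clock_hz=100_000_000)
cpu_like = throughput_ops_per_s(parallel_units=4, clock_hz=3_000_000_000)
print(fpga_like, cpu_like)  # 20000000000 12000000000
```

With these assumed numbers, 200 units at 100 MHz outrun 4 units at 3 GHz. Narrower number formats would raise the unit count further, at the cost of precision and range.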

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                      Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
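Rent's rule is commonly written as T = t * g^p, where T is the number of external connections of a block of logic, g is the number of gates (or logic blocks) it contains, t is the average number of terminals per gate, and p is the Rent exponent. The Python sketch below only evaluates that formula. The parameter values are illustrative assumptions and do not describe any particular FPGA family.

```python
def rent_terminals(t, g, p):
    """Rent's rule: estimated terminal count T = t * g**p."""
    return t * g ** p

# Illustrative values only: 256 blocks, 4 terminals per block, exponent 0.5.
print(rent_terminals(t=4, g=256, p=0.5))  # 64.0
```

A design with a higher exponent needs more connections for the same amount of logic, which is why routing demand varies between designs of equal size.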

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

Figure 12.1: FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                      Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
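The behaviour of such a logic block can be modelled in a few lines of Python: a 16-entry truth table stands in for the 4-input LUT, and a single stored bit stands in for the flip-flop. This is a minimal sketch, not vendor code. The class name `LogicBlock`, its methods and the input ordering (a is the least significant index bit) are assumptions made for this example.

```python
class LogicBlock:
    """Model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 truth-table entries")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.ff = 0  # flip-flop state

    def lut(self, a, b, c, d):
        """Look up the combinational output; a is the least significant bit."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)

# Configure the LUT as a 4-input XOR (parity of the index bits).
xor_table = [bin(i).count("1") & 1 for i in range(16)]
comb = LogicBlock(xor_table)
reg = LogicBlock(xor_table, registered=True)
reg.clock(1, 1, 1, 0)
print(comb.output(1, 0, 0, 0), comb.output(1, 1, 0, 0), reg.output(0, 0, 0, 0))  # 1 0 1
```

Loading a different truth table changes the function without changing the structure, which is what makes the block programmable.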

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                      Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                      Switch box topology
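The planar topology described above can be expressed compactly: a wire entering on one side of the switch box can reach only the wire with the same track number on each of the other three sides. The Python sketch below models this. The side names and function names are assumptions for illustration, and the switch count assumes one programmable switch per pair of wire ends on the same track.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")

def planar_connections(side, track):
    """Wires reachable from `track` on `side`: same track, other three sides."""
    return [(other, track) for other in SIDES if other != side]

def switch_count(channel_width):
    """Switches in one box, assuming one switch per same-track side pair."""
    pairs_per_track = len(list(combinations(SIDES, 2)))  # 6 side pairs
    return pairs_per_track * channel_width

print(planar_connections("west", 2))
# [('north', 2), ('east', 2), ('south', 2)]
print(switch_count(4))  # 24
```

Because tracks never mix inside a switch box, a signal that starts on track 2 stays on track 2 for its whole route in this model.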

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives.
Examples of these include multipliers, generic DSP blocks, embedded
processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier
to work with when handling large structures, because it is possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

• SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
• Antifuse - One-time programmable. CMOS.
• EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
• EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
• Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
• Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design has become mainstream practice, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.


Reasons why Prototyping is important

1 Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3 Prototyping also allows the developer to ensure that all IP technologies in the system work well together, outside the simulation stage and in actual form.

4 Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows more enhancement or improvement of the chip features than would otherwise have been possible.


                                      13 SIMULATION RESULTS

Fig: Gate level, VGA Controller

Fig: Gate level, Sync Unit

Fig: Technology schematic

                                      14 APPLICATIONS

• Motion tracking systems
• Image processing systems
• Display systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




VSYNC | O | Falling | Vertical synchronization signal. This signal is used to control the vertical deflection circuit in the VGA monitor, so that the start and end of a frame (of lines) is correctly displayed between the top and bottom edges of the visible display area of the screen. The vertical size of the display area is controlled by the DISPSIZE_V input to the Controller.

R1 | O | High Low | Provides the 2-bit digital signal for the intensity of red used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

G1 | O | High Low | Provides the 2-bit digital signal for the intensity of green used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

B1 | O | High Low | Provides the 2-bit digital signal for the intensity of blue used in composing a pixel's displayed color. These two signals are inputs to a simple 2-bit DAC (external to the Controller) that provides the analog signal required by the VGA monitor.

                                        9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
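As a rough illustration of how these quantities relate, the sketch below computes the line rate and refresh rate for the 640x480 mode. The pixel clock and the porch and sync widths are the commonly published industry values for 640x480 at 60 Hz; they are assumptions made here for illustration and are not taken from Figure 3. The actual timing of the Controller depends on DISPSIZE_H, DISPSIZE_V and CLK.

```python
# Assumed industry-standard 640x480 @ 60 Hz timing (not taken from this report).
PIXEL_CLOCK_HZ = 25_175_000

# Horizontal timing, in pixel clocks.
H_TIMING = {"visible": 640, "front_porch": 16, "sync": 96, "back_porch": 48}
# Vertical timing, in lines.
V_TIMING = {"visible": 480, "front_porch": 10, "sync": 2, "back_porch": 33}

# Total clocks per line and total lines per frame, including blanking.
line_period_clocks = sum(H_TIMING.values())
frame_period_lines = sum(V_TIMING.values())

# Lines per second and frames per second that follow from the totals.
line_rate_hz = PIXEL_CLOCK_HZ / line_period_clocks
refresh_rate_hz = line_rate_hz / frame_period_lines

if __name__ == "__main__":
    print(f"clocks per line : {line_period_clocks}")
    print(f"lines per frame : {frame_period_lines}")
    print(f"line rate       : {line_rate_hz:.2f} Hz")
    print(f"refresh rate    : {refresh_rate_hz:.2f} Hz")
```

With a different CLK frequency or display size, the same two divisions give the corresponding line and frame rates.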


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required for the control of the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)


The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
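The formula can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the mode labels ("bw", "16", "64") are made up here, and the result is rounded up to whole bytes, which the report does not state explicitly.

```python
# Pixels packed into one byte for each color palette (from the list above).
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store one picture of pix_total x lin_total pixels."""
    total_pixels = pix_total * lin_total
    per_byte = PIXELS_PER_BYTE[mode]
    # Ceiling division, so a partly filled last byte is still counted.
    return -(-total_pixels // per_byte)


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, picture_memory_bytes(640, 480, mode), "bytes")
```

For a 640x480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode and 307,200 bytes in 64 Colors mode.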

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, which is equivalent to right shifting the contents by three:

ADDR_PIXEL = 00 & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, which is equivalent to right shifting the contents by one:

ADDR_PIXEL = 0 & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
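The three addressing rules amount to dropping the low-order bits of hvcnt that select a pixel within a byte. A minimal behavioural sketch, assuming hvcnt is a 19-bit counter and using made-up mode labels, might look like this:

```python
HVCNT_MASK = (1 << 19) - 1  # hvcnt is treated as bits 18..0


def addr_pixel(hvcnt, mode):
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= HVCNT_MASK
    if mode == "bw":   # 8 pixels per byte: drop the 3 low bits
        return hvcnt >> 3
    if mode == "16":   # 2 pixels per byte: drop the low bit
        return hvcnt >> 1
    if mode == "64":   # 1 pixel per byte: use hvcnt directly
        return hvcnt
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, addr_pixel(11, mode))
```

For example, pixel number 11 is found in byte 1 in Black & White mode, byte 5 in 16 Colors mode and byte 11 in 64 Colors mode.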

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal+1)) and (VCOUNT < Frame Period)) then
enex = 0
Else
enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
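The enex condition can be written as a small function to make the blanking intervals explicit. This is a behavioural sketch only; the argument names mirror the terms used above, and the numeric values in the example (640, 480, 800, 525) are assumed for illustration rather than taken from the Controller.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while picture data may be read, 0 during blanking."""
    # Horizontal blanking: past the viewable pixels but before the line ends.
    h_blank = (pix_total + 1) < hcount < line_period
    # Vertical blanking: past the viewable lines but before the frame ends.
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


def rd(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """RD follows enex: memory reads are disabled whenever enex is Low."""
    return enex(hcount, vcount, pix_total, lin_total, line_period, frame_period)


if __name__ == "__main__":
    timing = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
    print(enex(100, 100, **timing))  # inside the visible area
    print(enex(700, 100, **timing))  # horizontal blanking
    print(enex(100, 500, **timing))  # vertical blanking
```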

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal (received from the Synchronization Unit) are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal (received from the Synchronization Unit) is a 0.

For the currently loaded byte, the active pixel is always in the low order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.


64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
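The load and shift behaviour of PIXREG in the three modes can be sketched as a simple loop over hvcnt. This is a simplified behavioural model, not the VHDL used in the project: it assumes hvcnt starts at zero, that one pixel is produced per CLK edge, and it ignores any memory read latency. The function name and mode labels are made up for the example.

```python
def pixel_stream(memory, mode, n_pixels):
    """Return the first n_pixels active-pixel values read through PIXREG."""
    pixreg = 0  # PIXREG is zero after reset
    pixels = []
    for hvcnt in range(n_pixels):
        if mode == "bw":
            if hvcnt & 0b111 == 0:       # lowest three bits are "000": load
                pixreg = memory[hvcnt >> 3]
            else:                        # otherwise shift right by one bit
                pixreg >>= 1
            pixels.append(pixreg & 0b1)  # active pixel is the lowest bit
        elif mode == "16":
            if hvcnt & 0b1 == 0:         # lowest bit is 0: load
                pixreg = memory[hvcnt >> 1]
            else:                        # otherwise shift right by four bits
                pixreg >>= 4
            pixels.append(pixreg & 0xF)  # active pixel is the low nibble
        elif mode == "64":
            pixreg = memory[hvcnt]       # load on every clock
            pixels.append(pixreg & 0x3F) # DATA[7..6] are not used
        else:
            raise ValueError(f"unknown mode: {mode!r}")
    return pixels


if __name__ == "__main__":
    print(pixel_stream([0b10110001], "bw", 8))
    print(pixel_stream([0xA3], "16", 2))
    print(pixel_stream([0xFF, 0x15], "64", 2))
```

Note that in this model the pixels of a byte are displayed starting from the least significant bit (or nibble), which follows from the right shifts described above.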

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital to analog converter circuitry, located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion


For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0V (total darkness) to 0.7V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
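The figure of 64 colors follows directly from three channels with four levels each. The sketch below enumerates them and shows one possible set of DAC output voltages. The report gives only the end points (0V and 0.7V), so the evenly spaced intermediate levels are an assumption made here for illustration.

```python
from itertools import product

V_MAX = 0.7          # maximum brightness, in volts
LEVELS_PER_GUN = 4   # a 2-bit code selects one of four levels


def dac_level(code):
    """Analog level for a 2-bit code, assuming evenly spaced steps."""
    if not 0 <= code < LEVELS_PER_GUN:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * code / (LEVELS_PER_GUN - 1)


# Every (R, G, B) combination of 2-bit codes the Controller can output.
ALL_COLORS = list(product(range(LEVELS_PER_GUN), repeat=3))

if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(LEVELS_PER_GUN)])
    print(len(ALL_COLORS), "distinct colors")
```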

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical, and both ports are interchangeable and both ports support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
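The totals in the block RAM table follow from the number of columns, the number of blocks per column, and the 18,432 bits in each block. The short sketch below recomputes them; the dictionary layout and function name are illustrative only.

```python
BITS_PER_BLOCK = 18432  # each block RAM, including the parity bits

# Device: (RAM columns, RAM blocks per column), as listed in the table.
DEVICES = {
    "XC3S100E": (1, 4),
    "XC3S250E": (2, 6),
    "XC3S500E": (2, 10),
    "XC3S1200E": (2, 14),
    "XC3S1600E": (2, 18),
}


def block_ram_totals(columns, blocks_per_column):
    """Return (total blocks, total bits, total Kbits) for one device."""
    blocks = columns * blocks_per_column
    bits = blocks * BITS_PER_BLOCK
    return blocks, bits, bits // 1024


if __name__ == "__main__":
    for name, (cols, per_col) in DEVICES.items():
        blocks, bits, kbits = block_ram_totals(cols, per_col)
        print(f"{name:10s} {blocks:3d} blocks {bits:8,d} bits {kbits:4d} Kbits")
```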


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

• Interoperability
• Technology independence
• Design reuse
• Several levels of abstraction
• Readability
• Standard language
• Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

• Design specification language
• Design entry language
• Design simulation language
• Design documentation language
• An alternative to schematics

10.1 Brief History

• VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
• Funded by the US Department of Defense
• The first publicly available version was released in 1985
• In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
• In 1987, standardization => IEEE 1076-1987
• An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

– IEEE 1076 itself defines no data type for simulation conditions such as unknown and high impedance
– Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard
– The IEEE 1164 standard was developed by the IEEE in response
– IEEE 1164 contains definitions for a nine-valued data type, std_logic
– IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
– Defines e.g. two numeric types: signed and unsigned
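The types from these two standards can be seen side by side in a minimal sketch. The entity name types_demo and its ports are hypothetical and chosen only for illustration; the example assumes a VHDL-93 or later tool with the standard IEEE packages.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use IEEE.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

-- Hypothetical example entity; name and ports are illustrative only.
entity types_demo is
  port (
    a, b   : in  std_logic_vector(7 downto 0);
    enable : in  std_logic;
    sum    : out std_logic_vector(7 downto 0);
    bus_o  : out std_logic_vector(7 downto 0)
  );
end types_demo;

architecture rtl of types_demo is
begin
  -- numeric_std: interpret the vectors as unsigned numbers, add them
  -- (the result wraps at 8 bits), and convert back.
  sum <= std_logic_vector(unsigned(a) + unsigned(b));

  -- std_logic has nine values: 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-'.
  -- 'Z' (high impedance) releases the output when enable is not '1'.
  bus_o <= a when enable = '1' else (others => 'Z');
end rtl;
```

The arithmetic is done on unsigned rather than directly on std_logic_vector, so that the intended number interpretation is explicit in the code.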

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library

Entities

– A black box with an interface definition
– Defines the inputs/outputs of a component (defines pins)
– A way to represent modularity in VHDL
– Similar to a symbol in a schematic
– An entity declaration describes the entity

E.g.

library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

– Provide channels of communication between the component and its environment
– Each port must have a name, a direction and a type
– An entity may have NO port declaration

Port directions

– In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
– Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.
– Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
– Buffer: An out port with read capability. May have at most one assignment. (Buffer ports are not recommended.)
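A common way to respect these rules without a buffer port is to keep the value in an internal signal and drive the out port from it. The following is a minimal sketch; the entity counter4 and its port names are hypothetical.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port modes.
entity counter4 is
  port (
    clk   : in  std_logic;                    -- mode in: read only
    rst   : in  std_logic;                    -- mode in: read only
    count : out std_logic_vector(3 downto 0)  -- mode out: assigned, never read
  );
end counter4;

architecture rtl of counter4 is
  -- Internal signal that can be both read and assigned, so the
  -- out port does not have to be declared as buffer.
  signal count_i : unsigned(3 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      count_i <= (others => '0');
    elsif rising_edge(clk) then
      count_i <= count_i + 1;  -- reads and assigns the internal signal
    end if;
  end process;

  -- Single concurrent assignment to the out port.
  count <= std_logic_vector(count_i);
end rtl;
```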

Architectures

– Every entity has at least one architecture
– One entity can have several architectures
– Architectures can describe a design using
  – Behavior
  – Structure
  – Dataflow
– Architectures can describe a design on many levels
  – Gate level
  – RTL (Register Transfer Level)
  – Behavioral level
– A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

– Link an entity declaration and an architecture body together
– The concept of a default configuration is a bit messy in VHDL '87
  – The last architecture analyzed links to the entity
– Can be used to change simulation behavior without re-analyzing the VHDL source
– Complex configuration declarations are ignored in synthesis
– Some entities can have e.g. a gate-level architecture and a behavioral architecture
– Are always optional
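As a minimal sketch of how a configuration declaration selects one of several architectures, the example below reuses the Comparator entity and the Comparator1 architecture from the text. The second architecture Comparator2 and the configuration name Comparator_cfg are hypothetical additions for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: concurrent (dataflow) description.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Second, hypothetical architecture: same function written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Configuration declaration: binds Comparator2 to the entity,
-- regardless of which architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg instead of Comparator makes the choice of architecture explicit; switching to Comparator1 only requires changing the configuration.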

Packages

Packages contain information common to many design units

1. Package declaration
  – constant declarations
  – type and subtype declarations
  – function and procedure declarations
  – global signal declarations
  – file declarations
  – component declarations

2. Package body
  – is not necessarily needed
  – function bodies
  – procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

A package consists of a declaration part and an optional body part

A package declaration can contain
  – type and subtype declarations
  – subprograms
  – constants
  – alias declarations
  – global signal declarations
  – file declarations
  – component declarations

A package body consists of
  – subprogram declarations and bodies
  – type and subtype declarations
  – deferred constants
  – file declarations
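The split between package declaration and package body can be sketched as follows. The package demo_pkg, its constant, subtype and function are all hypothetical and serve only to show where each kind of item goes.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical package; names and values are illustrative only.
package demo_pkg is
  -- Constant declaration
  constant DATA_WIDTH : natural := 8;
  -- Subtype declaration
  subtype data_t is std_logic_vector(DATA_WIDTH - 1 downto 0);
  -- Function declaration (the body is in the package body)
  function parity (d : data_t) return std_logic;
end demo_pkg;

package body demo_pkg is
  -- Function body: XOR of all bits, '1' when the number of '1' bits is odd.
  function parity (d : data_t) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in d'range loop
      p := p xor d(i);
    end loop;
    return p;
  end parity;
end demo_pkg;
```

A design unit that needs these items would make them visible with "use work.demo_pkg.all;", assuming the package was analyzed into the working library.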

Libraries

A collection of VHDL design units (a database)

1. Packages
  – package declaration
  – package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in the UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

– Structural
– Dataflow
– Behavioral

Structural VHDL description

– Circuit is described in terms of its components
– From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
– For large circuits, a low-level description quickly becomes impractical
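A structural description can be sketched with a small circuit built purely from component instances. The entities and2 and and3 below are hypothetical; the point is only that the top-level architecture contains nothing but components and the wires between them.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical leaf component: a 2-input AND gate.
entity and2 is
  port (x, y : in std_logic; z : out std_logic);
end and2;

architecture rtl of and2 is
begin
  z <= x and y;
end rtl;

library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical top level: a 3-input AND built from two and2 instances.
entity and3 is
  port (a, b, c : in std_logic; q : out std_logic);
end and3;

architecture structural of and3 is
  component and2
    port (x, y : in std_logic; z : out std_logic);
  end component;

  signal ab : std_logic;  -- internal wire between the two gates
begin
  u1 : and2 port map (x => a,  y => b, z => ab);
  u2 : and2 port map (x => ab, y => c, z => q);
end structural;
```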

10.3 Dataflow VHDL Description

– Circuit is described in terms of how data moves through the system
– In the dataflow style you describe how information flows between registers in the system
– The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
– The behavior of the system over time is defined by registers
– There are no built-in registers in the VHDL language
  – Either a lower-level description
  – or a behavioral description of sequential elements is needed
– The lower-level register descriptions must be created or obtained
– If there are no third-party models for registers => you must write the behavioral description of registers
– The behavioral description can be provided in the form of subprograms (functions or procedures)
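Since the language has no built-in register, one is usually described behaviorally with a clocked process. The sketch below shows one common form; the entity reg8, its width and its asynchronous active-low reset are assumptions for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous, active-low reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');    -- reset clears the register
    elsif rising_edge(clk) then
      q <= d;                  -- capture d on each rising clock edge
    end if;
  end process;
end behavioral;
```

A dataflow description would then instantiate such registers and describe the combinational logic between them with concurrent assignments.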

10.4 Behavioral VHDL Description

– Circuit is described in terms of its operation over time
– Representation might include e.g. state diagrams, timing diagrams and algorithmic descriptions
– The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
– If no actual delay is used, only the order of sequential operations is defined
– In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
– The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
– There are a few tools for behavioral synthesis
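The delay example from the list can be placed in a complete design unit as a minimal sketch. The entity delay_buf is hypothetical and is meant for simulation only.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical delayed buffer, intended for simulation only.
entity delay_buf is
  port (B : in std_logic; A : out std_logic);
end delay_buf;

architecture behavioral of delay_buf is
begin
  -- In simulation, A takes the value of B 10 ns after B changes.
  -- This is the default inertial delay, so pulses on B shorter
  -- than 10 ns do not appear on A.
  -- Synthesis tools typically ignore the "after" clause.
  A <= B after 10 ns;
end behavioral;
```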

Concurrent vs. Sequential

Processes

– Basic simulation concept in VHDL
– A VHDL description can always be broken up into interconnected processes
– Quite similar to a Unix process
– Process keyword in VHDL
– A process statement is a concurrent statement
– Statements inside process statements are sequential statements
– A process must contain either a sensitivity list or wait statement(s), but NOT both
– The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
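The two alternatives, sensitivity list or wait statement, can be compared in a minimal sketch. The entity mux2 and both architecture names are hypothetical; the two processes are intended to behave identically in simulation.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Hypothetical 2-to-1 multiplexer.
entity mux2 is
  port (sel, d0, d1 : in std_logic; y : out std_logic);
end mux2;

-- Version 1: process with a sensitivity list.
architecture with_sens_list of mux2 is
begin
  process (sel, d0, d1)
  begin
    if sel = '1' then
      y <= d1;
    else
      y <= d0;
    end if;
  end process;
end with_sens_list;

-- Version 2: process with a wait statement and no sensitivity list.
architecture with_wait of mux2 is
begin
  process
  begin
    if sel = '1' then
      y <= d1;
    else
      y <= d0;
    end if;
    wait on sel, d0, d1;  -- suspend until one of these signals changes
  end process;
end with_wait;
```

A sensitivity list is equivalent to a single "wait on" statement at the end of the process, which is why the two forms may not be mixed.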


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
  clk50m : in  std_logic;
  rstn   : in  std_logic;
  hsync  : out std_logic;
  vsync  : out std_logic;
  red    : out std_logic_vector(0 downto 0);
  green  : out std_logic_vector(0 downto 0);
  blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
  port (
    addr  : IN  std_logic_VECTOR(17 downto 0);
    clk   : IN  std_logic;
    dout  : OUT std_logic_VECTOR(0 downto 0);
    en    : IN  std_logic;
    sinit : IN  std_logic);
END component;

component sync_unit port(
  clk        : in    std_logic;
  rstn       : in    std_logic;
  hsync      : out   std_logic;  ---horizontal synch pulse(row)
  vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out   std_logic;
  h_en       : out   std_logic;
  v_en       : out   std_logic;
  line_count : inout integer;
  c_hs       : inout integer);
end component;

component pixelgen port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

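begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Clock divider: clk toggles on every rising edge of clk50m,
-- so clk runs at half the frequency of clk50m.
process(clk50m, rstn)
begin
  if (rstn = '0') then
    clk <= '0';
  elsif (clk50m = '1' and clk50m'event) then
    clk <= not clk;
  end if;
  --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;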

sync : sync_unit port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  v_en_reg   => v_en_reg_s,
  line_count => line_count_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  c_hs       => c_hs_s
);

pixel : pixelgen port map(
  clk        => clk,
  rstn       => rstn,
  hsync      => hsync_s,
  vsync      => vsync_s,
  h_en       => h_en_s,
  v_en       => v_en_s,
  v_en_reg   => v_en_reg_s,
  data_in    => data_s,
  line_count => line_count_s,
  c_hs       => c_hs_s,
  red        => red,
  green      => green,
  blue       => blue,
  addr       => addr_s,
  rd         => rd_s
);

mem12 : memory port map(
  addr  => addr_s,
  clk   => clk,
  dout  => data_s,
  en    => rd_s,
  sinit => rstn
);

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  ---horizontal synch pulse(row)
  vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
  if (rstn = '0') then
    c_hs_s <= 0;
  else
    if (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
  if (rstn = '0') then
    line_count_s <= 0;
  elsif (h_en_s = '1' and h_en_s'event) then
    line_count_s <= line_count_s + 1;
    if (line_count_s = 515) then
      line_count_s <= 0;
    end if;
  end if;
  --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
  if (rstn = '0') then
    hsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn)  ----Horizontal Enable (Row Information Will be Sent in this period)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

---vsync pulse
process(clk, rstn)  -- Vertical Enable (Data is Valid in this Region)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

process(clk, rstn)  --Vertical Sync, Front Porch, Back Porch
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
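The counter limits in sync_unit translate into signal timing once the clock period is known. The sketch below is a simulation-only check, not part of the design. It assumes that clk50m is a 50 MHz clock (suggested by its name, not stated in the code), so that the divided clk has a 40 ns period. The entity name vga_timing_check and the constant names are hypothetical. Under that assumption, one line takes 794 x 40 ns = 31.76 us, the hsync pulse is 96 x 40 ns = 3.84 us, and a frame of 516 lines takes 409,704 clocks, about 16.39 ms, or roughly 61 frames per second.

```vhdl
-- Simulation-only sketch; entity and constant names are hypothetical.
entity vga_timing_check is
end vga_timing_check;

architecture sim of vga_timing_check is
  -- Assumption: clk50m is 50 MHz, so clk (divided by two) is 25 MHz.
  constant CLK_PERIOD : time    := 40 ns;
  constant H_TOTAL    : natural := 794;  -- c_hs_s counts 0 to 793
  constant V_TOTAL    : natural := 516;  -- line_count_s counts 0 to 515
  constant H_SYNC     : natural := 96;   -- hsync is low while c_hs_s <= 95

  constant LINE_TIME  : time := H_TOTAL * CLK_PERIOD;
  constant SYNC_TIME  : time := H_SYNC * CLK_PERIOD;
begin
  process
  begin
    report "line period       = " & time'image(LINE_TIME);
    report "hsync pulse width = " & time'image(SYNC_TIME);
    report "clocks per frame  = " & integer'image(H_TOTAL * V_TOTAL);
    wait;  -- report once, then suspend forever
  end process;
end sim;
```

These values are close to, but not identical with, the nominal 640x480 timing at 60 Hz, which uses 800 clocks per line at 25.175 MHz.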


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

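architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
  if (rstn = '0') then
    rd <= '0';
    red <= "0";
    green <= "0";
    blue <= "0";
    addr_s <= x"0000" & "00";
  elsif (clk = '1' and clk'event) then
    -- Rising edge of v_en (start of the visible frame): restart the address.
    if ((v_en = '1') and (v_en_reg = '0')) then
      addr_s <= (others => '0');
    end if;
    if (v_en = '1' and h_en = '1') then
      if (line_count <= 514) then
        if (c_hs <= 655) then
          -- Inside the image area: read memory and show the pixel.
          rd <= '1';
          red <= data_in;
          green <= data_in;
          blue <= data_in;
          addr_s <= addr_s + "000000000000000001";
        else
          -- Right of the image area: green.
          rd <= '0';
          red <= "0";
          green <= "1";
          blue <= "0";
        end if;
      else
        -- Below the image area: white.
        rd <= '0';
        red <= "1";
        green <= "1";
        blue <= "1";
      end if;
      -- Wrap the address after the last memory location in use.
      if (addr_s = "111011111111111111") then
        addr_s <= (others => '0');
      end if;
    else
      -- Blanking interval: black.
      red <= "0";
      green <= "0";
      blue <= "0";
    end if;
  end if;
  -- end if;
  --end if;
end process;

end behv;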


                                        12 FPGA ARCHITECTURE

                                        A field-programmable gate array (FPGA) is a semiconductor device that can

                                        be configured by the customer or designer after manufacturingmdashhence the

                                        name field-programmable FPGAs are programmed using a logic circuit

                                        diagram or a source code in a hardware description language (HDL) to specify

                                        how the chip will work They can be used to implement any logical function

                                        that an application-specific integrated circuit (ASIC) could perform but the

                                        ability to update the functionality after shipping offers advantages for many

                                        applications

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable interconnects
between gates, the beginnings of a new technology and market. The XC2064
boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
tables (LUTs). More than 20 years later, Freeman was inducted into the
National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
the volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be
found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or
more PowerPC processors embedded within the FPGA's logic fabric. The Atmel
FPSLIC is another such device, which uses an AVR processor in combination
with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic.

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in
the field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to
design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make
use of the massive parallelism offered by their architecture. One such area
is code breaking, in particular brute-force attacks on cryptographic
algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
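As a minimal sketch of this kind of parallelism, the VHDL below places several
narrow multipliers side by side with a for-generate loop. The entity name
parallel_mult, the generics LANES and WIDTH, and the flattened port layout are
hypothetical choices made for illustration; they are not part of the VGA
controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: LANES independent unsigned multipliers, each
-- WIDTH bits wide, all producing a registered product on every clock.
entity parallel_mult is
    generic (
        LANES : positive := 4;   -- number of parallel arithmetic units (assumed)
        WIDTH : positive := 8    -- operand precision per lane (assumed)
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(LANES*WIDTH-1 downto 0);   -- operands, lane 0 in the low bits
        b   : in  std_logic_vector(LANES*WIDTH-1 downto 0);
        p   : out std_logic_vector(LANES*2*WIDTH-1 downto 0)  -- products, 2*WIDTH bits per lane
    );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
    -- One multiplier and one output register are generated per lane.
    gen_lanes : for i in 0 to LANES-1 generate
        process (clk)
        begin
            if rising_edge(clk) then
                p((i+1)*2*WIDTH-1 downto i*2*WIDTH) <= std_logic_vector(
                    unsigned(a((i+1)*WIDTH-1 downto i*WIDTH)) *
                    unsigned(b((i+1)*WIDTH-1 downto i*WIDTH)));
            end if;
        end process;
    end generate gen_lanes;
end architecture rtl;
```

Halving WIDTH while doubling LANES keeps the input bus the same size but
doubles the number of products computed per clock, which is the
precision-for-parallelism trade described above.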

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

                                        Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
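For reference, Rent's rule is commonly written as the power law below. Here T
is the number of external connections of a region of logic, g the number of
logic blocks in that region, t the average number of pins per block and p the
Rent exponent. The numeric values in the worked line are illustrative
assumptions, not figures taken from this report.

```latex
T = t \, g^{p}
% Illustrative values (assumed): t = 5 pins per block, p = 0.6, g = 64 blocks
T = 5 \times 64^{0.6} \approx 5 \times 12.1 \approx 61
```

A larger exponent p means the connection count of a region grows faster with
its size, which is one way to see why designs with the same amount of logic
can need very different channel widths.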

The FPGA considered here is an array or island-style FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height
of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

Fig 12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                        Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
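The logic block just described can be modelled behaviourally in a few lines
of VHDL. This is only a sketch under stated assumptions: the entity name
logic_block and the generics LUT_INIT and USE_REG are hypothetical stand-ins
for the configuration bits of a real device, which are set by the bitstream
rather than by generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT plus D flip-flop.
entity logic_block is
    generic (
        LUT_INIT : std_logic_vector(15 downto 0) := x"8000"; -- truth table; x"8000" is a 4-input AND
        USE_REG  : boolean := false                           -- true selects the registered output
    );
    port (
        clk    : in  std_logic;
        lut_in : in  std_logic_vector(3 downto 0);
        q      : out std_logic
    );
end entity logic_block;

architecture rtl of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT is a 16 x 1 truth table addressed by the four inputs.
    lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

    -- D flip-flop capturing the LUT output on the rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Output multiplexer: registered or unregistered LUT output.
    q <= ff_out when USE_REG else lut_out;
end architecture rtl;
```

Changing only LUT_INIT changes the function: x"6996", for example, turns the
same block into a 4-input XOR (odd parity), since bit i of the table holds
the output for input value i.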

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                        Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives
those functions increased speed compared to building them from primitives.
Examples of these include multipliers, generic DSP blocks, embedded
processors, high speed I/O logic and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.
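As an illustration of specifying a structure numerically, the sketch below
builds a ripple-carry adder of any width from a single generic. The entity
name ripple_adder and the generic WIDTH are hypothetical and chosen only for
this example; in a schematic, each full-adder stage would have to be drawn
and wired by hand.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: WIDTH full-adder stages generated from one number.
entity ripple_adder is
    generic (WIDTH : positive := 8);
    port (
        a, b : in  std_logic_vector(WIDTH-1 downto 0);
        cin  : in  std_logic;
        sum  : out std_logic_vector(WIDTH-1 downto 0);
        cout : out std_logic
    );
end entity ripple_adder;

architecture structural of ripple_adder is
    -- carry(i) is the carry into bit i; carry(WIDTH) is the final carry out.
    signal carry : std_logic_vector(WIDTH downto 0);
begin
    carry(0) <= cin;

    -- One full adder per bit position.
    gen_bits : for i in 0 to WIDTH-1 generate
        sum(i)     <= a(i) xor b(i) xor carry(i);
        carry(i+1) <= (a(i) and b(i)) or (carry(i) and (a(i) xor b(i)));
    end generate gen_bits;

    cout <= carry(WIDTH);
end architecture structural;
```

Instantiating the same entity with a generic map of WIDTH => 32 yields a
32-bit adder without any change to the description.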

Then, using an electronic design automation tool, a technology-mapped
netlist is generated. The netlist can then be fitted to the actual FPGA
architecture using a process called place-and-route, usually performed by
the FPGA company's proprietary place-and-route software. The user will
validate the map, place and route results via timing analysis, simulation and
other verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
such as an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again
with these values back-annotated onto the netlist.
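The first of these stages, RTL simulation with a test bench, might look like
the sketch below. The design under test (counter4), the test bench name
(counter4_tb) and the 40 ns clock period are assumptions made for this
example and are not taken from the VGA controller source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical design under test: a 4-bit counter with synchronous reset.
entity counter4 is
    port (
        clk   : in  std_logic;
        reset : in  std_logic;
        count : out std_logic_vector(3 downto 0)
    );
end entity counter4;

architecture rtl of counter4 is
    signal count_r : unsigned(3 downto 0) := (others => '0');
begin
    process (clk)
    begin
        if rising_edge(clk) then
            if reset = '1' then
                count_r <= (others => '0');
            else
                count_r <= count_r + 1;
            end if;
        end if;
    end process;
    count <= std_logic_vector(count_r);
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Test bench: no ports, it only drives and observes the design under test.
entity counter4_tb is
end entity counter4_tb;

architecture sim of counter4_tb is
    constant CLK_PERIOD : time := 40 ns;  -- assumed clock period
    signal clk   : std_logic := '0';
    signal reset : std_logic := '1';
    signal count : std_logic_vector(3 downto 0);
    signal done  : boolean := false;
begin
    dut : entity work.counter4
        port map (clk => clk, reset => reset, count => count);

    -- Free-running clock that stops once the stimulus process has finished.
    clk <= not clk after CLK_PERIOD / 2 when not done else '0';

    stimulus : process
    begin
        -- Hold reset across two rising edges, then release it.
        wait for 2 * CLK_PERIOD;
        reset <= '0';
        -- Let the counter run for five clock cycles and check the result.
        wait for 5 * CLK_PERIOD;
        assert unsigned(count) = 5
            report "counter4: unexpected count after five cycles"
            severity error;
        done <= true;
        wait;
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and
back-annotated simulations by binding dut to the corresponding netlist
instead of the RTL architecture.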

                                        Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of hardware designs together with early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA
has become a good method to do this.


                                        Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed
and modeling accuracy limitations hinder such verification.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the
software-hardware integration stage. This helps uncover many unforeseen
software bugs that appear due to today's array of operating systems,
applications and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


                                        13 SIMULATION RESULTS

Fig VGA CONTROLLER (gate level)

Fig SYNC UNIT (gate level)

Fig TECHNOLOGY SCHEMATIC

14 APPLICATIONS

Motion tracking systems
Image processing systems
Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



9 VGA SIGNAL TIMING

Figure 3 summarizes the signal timing involved in sending a line of pixels and a frame of lines. The actual time values differ according to the resolution selected (640x480 or 800x600), the processor-defined values for DISPSIZE_H and DISPSIZE_V, and the frequency of the external clock signal (CLK).
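
As a rough illustration of how these quantities relate, the following Python sketch derives the line period, frame period and refresh rate from the pixel clock and the per-line and per-frame counts. The numbers are the widely published industry-standard figures for 640x480 at 60 Hz (25.175 MHz pixel clock, 800 pixel clocks per line, 525 lines per frame), not values taken from Figure 3, and the class and field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VgaTiming:
    """Hypothetical container for the timing parameters of one display mode."""
    pixel_clock_hz: float
    h_visible: int       # viewable pixels per line
    h_front_porch: int
    h_sync: int
    h_back_porch: int
    v_visible: int       # viewable lines per frame
    v_front_porch: int
    v_sync: int
    v_back_porch: int

    @property
    def h_total(self) -> int:
        """Pixel clocks in one complete line, including blanking."""
        return self.h_visible + self.h_front_porch + self.h_sync + self.h_back_porch

    @property
    def v_total(self) -> int:
        """Lines in one complete frame, including blanking."""
        return self.v_visible + self.v_front_porch + self.v_sync + self.v_back_porch

    @property
    def line_period_s(self) -> float:
        return self.h_total / self.pixel_clock_hz

    @property
    def frame_period_s(self) -> float:
        return self.v_total * self.line_period_s

    @property
    def refresh_hz(self) -> float:
        return 1.0 / self.frame_period_s


# Industry-standard 640x480 @ 60 Hz timing (an assumption, not read from Figure 3).
VGA_640X480_60 = VgaTiming(25.175e6, 640, 16, 96, 48, 480, 10, 2, 33)

if __name__ == "__main__":
    t = VGA_640X480_60
    print(f"line period : {t.line_period_s * 1e6:.2f} us")
    print(f"frame period: {t.frame_period_s * 1e3:.2f} ms")
    print(f"refresh rate: {t.refresh_hz:.2f} Hz")
```

Changing the pixel clock or any of the counts shows directly how the line and frame periods scale with the selected resolution and CLK frequency.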

9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required to control the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g. using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)
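
The packing described in the list above can be sketched in Python as follows. The mode names ("bw", "16", "64") are hypothetical stand-ins for the CMOD encoding, which is not given here, and the pixel order (lowest bits first) is an assumption based on the Pixel register description, where the active pixel sits in the lowest bit positions.

```python
# Bits per pixel for each color palette; the keys are illustrative names,
# not the actual CMOD input codes.
BITS_PER_PIXEL = {"bw": 1, "16": 4, "64": 6}


def unpack_byte(data: int, mode: str) -> list[int]:
    """Split one byte of picture memory into its pixel values.

    Pixels are returned lowest bits first (assumed display order).
    In 64 Colors mode the top two bits, DATA[7..6], are ignored.
    """
    bpp = BITS_PER_PIXEL[mode]
    pixels_per_byte = 8 // bpp          # 8, 2 or 1
    mask = (1 << bpp) - 1
    return [(data >> (i * bpp)) & mask for i in range(pixels_per_byte)]


if __name__ == "__main__":
    print(unpack_byte(0b10110001, "bw"))  # eight 1-bit pixels
    print(unpack_byte(0xA5, "16"))        # two 4-bit pixels
    print(unpack_byte(0xFF, "64"))        # one 6-bit pixel
```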

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
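
A small worked example of this formula in Python, assuming PixTotal and LinTotal are the viewable pixel and line counts and rounding up to a whole byte (the function name is illustrative):

```python
def picture_memory_bytes(pix_total: int, lin_total: int, pixels_per_byte: int) -> int:
    """Bytes needed to hold one picture, rounded up to a whole byte."""
    total_pixels = pix_total * lin_total
    return -(-total_pixels // pixels_per_byte)  # ceiling division


if __name__ == "__main__":
    for name, per_byte in (("Black & White", 8), ("16 Colors", 2), ("64 Colors", 1)):
        print(f"640x480, {name}: {picture_memory_bytes(640, 480, per_byte)} bytes")
```

For a 640x480 picture this gives 38400 bytes in Black & White mode, 153600 bytes in 16 Colors mode and 307200 bytes in 64 Colors mode.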

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, i.e. right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, i.e. right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
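
The three addressing rules amount to a right shift of the pixel counter by the base-2 logarithm of the number of pixels per byte. A minimal Python sketch, assuming hvcnt is a 19-bit counter and using hypothetical mode names in place of the CMOD codes:

```python
HVCNT_WIDTH = 19  # hvcnt[18..0]

# Right-shift amount per palette: log2(pixels per byte).
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def pixel_address(hvcnt: int, mode: str) -> int:
    """Byte address in picture memory that holds the pixel counted by hvcnt."""
    hvcnt &= (1 << HVCNT_WIDTH) - 1      # keep only the 19 counter bits
    return hvcnt >> ADDRESS_SHIFT[mode]  # upper bits are zero-filled


if __name__ == "__main__":
    for mode in ("bw", "16", "64"):
        print(mode, [pixel_address(n, mode) for n in range(10)])
```

In Black & White mode the address advances once every eight counts, in 16 Colors mode once every two counts, and in 64 Colors mode on every count.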

The Pixel register (PIXREG)

The Pixel register is used to receive the byte of pixel data read from the current address in memory space. The register PIXREG is reset to zero when the VGA Controller receives an external reset signal (RST).

The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
  enex = 0
Else
  enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
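
The enex condition can be written as a small Python function for experimentation. This is a behavioral sketch only: the parameter names mirror the terms used above, and the example values (800 counts per line, 525 lines per frame for a 640x480 display) are standard VGA figures assumed for illustration.

```python
def enex(hcount: int, vcount: int,
         pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return 1 while picture memory may be read, 0 during blanking."""
    in_h_blank = (pix_total + 1) < hcount < line_period
    in_v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (in_h_blank or in_v_blank) else 1


if __name__ == "__main__":
    # Assumed 640x480 geometry: 800 counts per line, 525 lines per frame.
    args = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)
    print(enex(100, 100, **args))  # inside the visible area
    print(enex(700, 100, **args))  # horizontal blanking
    print(enex(100, 500, **args))  # vertical blanking
```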

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.

64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
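
The load-and-shift behavior of PIXREG in the three modes can be modeled in a few lines of Python. This is a simplified behavioral sketch, not the controller's HDL: it ignores memory latency and the RD/enex gating, and the class, method and mode names are illustrative.

```python
class PixelRegister:
    """Behavioral model of PIXREG for one color palette."""

    # (bits per pixel, pixels per byte) for each illustrative mode name.
    FORMATS = {"bw": (1, 8), "16": (4, 2), "64": (6, 1)}

    def __init__(self, mode: str) -> None:
        self.bpp, self.pixels_per_byte = self.FORMATS[mode]
        self.value = 0  # state after an external reset (RST)

    def reset(self) -> None:
        self.value = 0

    def clock(self, hvcnt: int, data: int) -> int:
        """Model one rising edge of CLK and return the active pixel.

        A new byte is loaded when the low bits of hvcnt are zero
        (always, in 64 Colors mode); otherwise the register shifts
        right by one pixel so the next pixel becomes active.
        """
        if hvcnt % self.pixels_per_byte == 0:
            self.value = data & 0xFF
        else:
            self.value >>= self.bpp
        return self.value & ((1 << self.bpp) - 1)


if __name__ == "__main__":
    reg = PixelRegister("16")
    print([reg.clock(h, 0xA5) for h in range(2)])        # low nibble, then high nibble
    reg = PixelRegister("bw")
    print([reg.clock(h, 0b10110001) for h in range(8)])  # bit 0 first
```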

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded is dependent on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.
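
The update rule for RGBREG can be sketched in Python as below. The mapping table itself is not reproduced in this text, so two things here are assumptions: the Black & White mapping (0 to black, 1 to full-intensity white) and the requirement that the caller supply the 16-entry palette for 16 Colors mode. Function and mode names are illustrative.

```python
BLACK = 0b000000

# Assumed Black & White mapping: pixel 0 -> black, pixel 1 -> white.
BW_TO_RGB = {0: 0b000000, 1: 0b111111}


def next_rgbreg(pixel: int, mode: str, en: int, palette16=None) -> int:
    """Value loaded into RGBREG on the next rising edge of CLK."""
    if not en:
        return BLACK                      # blanking: force black
    if mode == "64":
        return pixel & 0x3F               # 6-bit pixel value used directly
    if mode == "bw":
        return BW_TO_RGB[pixel & 0x1]
    if mode == "16":
        if palette16 is None or len(palette16) != 16:
            raise ValueError("a 16-entry RGB palette must be supplied")
        return palette16[pixel & 0xF] & 0x3F
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    print(f"{next_rgbreg(0x2A, '64', en=1):06b}")
    print(f"{next_rgbreg(0x2A, '64', en=0):06b}")
    print(f"{next_rgbreg(1, 'bw', en=1):06b}")
```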

The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
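
The level and color counts follow from simple arithmetic, sketched here in Python. The even spacing of the four levels between 0 V and 0.7 V is an assumption made for illustration; the actual levels depend on the converter circuitry on the board.

```python
V_MAX = 0.7  # volts, maximum brightness


def dac_level(code: int) -> float:
    """Analog level for a 2-bit color code, assuming evenly spaced steps."""
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * (code / 3)


def color_count(bits_per_channel: int = 2, channels: int = 3) -> int:
    """Number of distinct colors: levels per channel raised to the channel count."""
    return (2 ** bits_per_channel) ** channels


if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    print(color_count())
```

With 2 bits per channel there are 4 levels per color and 4 x 4 x 4 = 64 colors in total.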

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73728            72
XC3S250E     2             6                       12                 221184           216
XC3S500E     2             10                      20                 368640           360
XC3S1200E    2             14                      28                 516096           504
XC3S1600E    2             18                      36                 663552           648

Each block RAM contains 18432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
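
To connect these figures to the picture memory discussed earlier, the Python sketch below recomputes the total bits per device from the block counts and checks whether a picture would fit in block RAM alone. It is a back-of-envelope estimate under stated assumptions: only the 16K data bits of each block are counted, every block is assumed free for picture storage, and the helper names are illustrative.

```python
BLOCK_RAM_BITS = 18432            # total bits per block, including parity
BLOCK_RAM_DATA_BITS = 16 * 1024   # data bits per block

# Total RAM blocks per Spartan-3E device.
RAM_BLOCKS = {
    "XC3S100E": 4,
    "XC3S250E": 12,
    "XC3S500E": 20,
    "XC3S1200E": 28,
    "XC3S1600E": 36,
}


def total_ram_bits(device: str) -> int:
    """Total block RAM bits, including parity bits."""
    return RAM_BLOCKS[device] * BLOCK_RAM_BITS


def data_capacity_bytes(device: str) -> int:
    """Bytes available when only the data bits of each block are used."""
    return RAM_BLOCKS[device] * BLOCK_RAM_DATA_BITS // 8


def frame_fits(device: str, pix_total: int, lin_total: int, pixels_per_byte: int) -> bool:
    """True if one picture fits in the device's block RAM data bits."""
    needed = -(-(pix_total * lin_total) // pixels_per_byte)  # ceiling division
    return needed <= data_capacity_bytes(device)


if __name__ == "__main__":
    for dev in RAM_BLOCKS:
        print(dev, total_ram_bits(dev), frame_fits(dev, 640, 480, 8))
```

Under these assumptions a 640x480 Black & White picture (38400 bytes) fits in an XC3S500E (40960 data bytes) but not in an XC3S250E, while a 640x480 picture in 16 Colors mode (153600 bytes) exceeds the block RAM of every device in the table.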

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?
Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?
VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986 IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL
In 1987: standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard
The IEEE 1164 standard was developed by the IEEE
- IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware
- Defines e.g. two numeric types: signed and unsigned

VHDL Environment

Design Units
Segments of VHDL code that can be compiled separately and stored in a library

Entities
A black box with interface definition
Defines the inputs/outputs of a component (define pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
Entity declaration describes the entity

E.g.

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports
Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have NO port declaration

Port directions
In: A value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (not recommended).

Architectures
Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations
Links entity declaration and architecture body together
Concept of default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have e.g. a gate-level architecture and a behavioral architecture
Are always optional

Packages
Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
Consists of a declaration part and an optional body part

Package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries
Collection of VHDL design units (database)
1. Packages
- package declaration
- package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in a UNIX file system
Can also be any other kind of database

                                          Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

– Structural
– Dataflow
– Behavioral

                                          Structural VHDL description

The circuit is described in terms of its components.

The description can range from a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.
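A structural description can be illustrated with a half adder built from two gate components. This is a minimal sketch, not part of the VGA design: the entities and2, xor2 and half_adder are hypothetical and exist only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture rtl of xor2 is
begin
  y <= a xor b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end entity half_adder;

-- Structural style: the architecture only declares components and wires them.
architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_xor : xor2 port map (a => a, b => b, y => sum);
  u_and : and2 port map (a => a, b => b, y => carry);
end architecture structural;
```

The top-level vga_controller architecture later in this report follows the same pattern: it declares three components and connects them with port maps.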

10.3 Dataflow VHDL Description


The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by the registers.

There are no built-in registers in the VHDL language:

– either a lower-level description
– or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If no third-party models for registers are available, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
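As an illustration of the points above, the sketch below first gives a behavioral description of a register (here written as a clocked process rather than as a subprogram) and then uses it in a dataflow architecture. The entity names reg8 and masked_pipe, the 8-bit width and the active-low reset are assumptions made for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                      -- active-low asynchronous reset (assumed)
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0));
end entity reg8;

-- Behavioral description of the register, since VHDL has no built-in one.
architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture behavioral;

library ieee;
use ieee.std_logic_1164.all;

entity masked_pipe is
  port (
    clk, rstn : in  std_logic;
    a, mask   : in  std_logic_vector(7 downto 0);
    y         : out std_logic_vector(7 downto 0));
end entity masked_pipe;

-- Dataflow style: concurrent assignments describe the logic between registers.
architecture dataflow of masked_pipe is
  signal masked : std_logic_vector(7 downto 0);
begin
  masked <= a and mask;                        -- combinational logic
  u_reg : entity work.reg8
    port map (clk => clk, rstn => rstn, d => masked, q => y);
end architecture dataflow;
```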

10.4 Behavioral VHDL Description

                                          Circuit is described in terms of its operation over time


The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
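The delay notation mentioned above can be tried in simulation with a small self-checking sketch. The entity name delay_demo and the signals a and b are hypothetical; the 15 ns check point is an arbitrary choice that falls after the 10 ns delay has elapsed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal b : std_logic := '0';
  signal a : std_logic;
begin
  -- Explicit delay: meaningful in simulation, ignored by synthesis tools.
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';
    wait for 15 ns;   -- later than the 10 ns delay
    assert a = '1'
      report "a should follow b after 10 ns" severity error;
    wait;             -- suspend forever
  end process stimulus;
end architecture behavioral;
```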

                                          Concurrent Vs Sequential

                                          Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

This is quite similar to a Unix process.


The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
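The two permitted forms of a process can be compared side by side. In this sketch (the entity and_gate_demo and its ports are hypothetical), both processes describe the same AND function: the first uses a sensitivity list, the second an equivalent wait statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate_demo is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement instead.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;   -- resume when a or b changes
  end process with_wait;
end architecture behavioral;
```

Both processes run once at the start of simulation and then again whenever a or b changes, so y1 and y2 always carry the same value.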


                                          11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are non-standard packages; they are
-- kept here as in the original design, alongside numeric_std.
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;                        -- 50 MHz board clock
    rstn   : in  std_logic;                        -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  -- Port modes of line_count and c_hs are "out" here to match the
  -- sync_unit entity declaration.
  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;   -- horizontal sync pulse (row)
      vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                                          SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;   -- horizontal sync pulse (row)
    vsync      : out std_logic;   -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within one line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync pulse (front porch, back porch)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used downstream to detect the start of a frame
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;


                                          PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of frame): restart the memory address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            -- Right of the image: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the image: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap after the last pixel (address 245759 = 512*480 - 1)
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking interval: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


                                          12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
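How a configurable logic block can implement an arbitrary small combinational function may be pictured with a lookup table. The following is only a behavioral sketch of the idea, not a model of any vendor's logic block: the entity lut3 and its generic INIT are hypothetical names, and the default contents are chosen here to give a 3-input XOR.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity lut3 is
  generic (
    -- Truth table: bit k is the output for input value k.
    -- "10010110" is the parity (XOR) of the three inputs.
    INIT : std_logic_vector(7 downto 0) := "10010110");
  port (
    i : in  std_logic_vector(2 downto 0);
    o : out std_logic);
end entity lut3;

architecture behavioral of lut3 is
begin
  -- The inputs simply select one bit of the stored truth table.
  o <= INIT(to_integer(unsigned(i)));
end architecture behavioral;
```

Changing INIT changes the function: for example, "10000000" would make the same block a 3-input AND, which is the sense in which the logic is programmable.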

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs) with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

• IC costs are rising aggressively
• ASIC complexity has increased development time and costs
• R&D resources and headcount are decreasing
• Revenue losses for slow time-to-market are increasing
• Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that implement decoders or mathematical functions.

Some FPGAs are capable of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs find applications especially in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware cost per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                          Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
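Rent's rule relates the number of external terminals T of a block of logic to the number of gates g inside it, as T = t * g^p. The following is a minimal Python sketch of that relationship only. The default values of t and p are illustrative assumptions, not figures from this report or from any FPGA vendor, and the function name is hypothetical.

```python
def rent_terminals(gates, t=4.0, p=0.6):
    """Estimate external terminals with Rent's rule: T = t * gates**p.

    t (terminals per gate) and p (Rent exponent) are assumed,
    illustrative values; real values are fitted to existing designs.
    """
    if gates < 1:
        raise ValueError("gates must be at least 1")
    return t * gates ** p


if __name__ == "__main__":
    # Compare how the terminal estimate grows with block size.
    for g in (100, 1000, 10000):
        print(g, round(rent_terminals(g), 1))
```

Because p is below 1, the estimated terminal count grows more slowly than the gate count, which is why routing demand can be bounded when sizing the channels.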

The FPGA considered here is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

                                          Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are managed separately.
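The logic block described above can be modelled in a few lines of Python. This is a behavioural sketch only, assuming a 16-entry truth table indexed by the four inputs and a configuration bit that selects the registered or unregistered output. The class and method names are hypothetical and do not correspond to any vendor primitive.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 truth-table entries")
        self.truth_table = list(truth_table)
        self.registered = registered  # configuration bit: output select
        self.ff = 0                   # flip-flop state

    def lut(self, a, b, c, d):
        # The four inputs form the address into the lookup table.
        index = a | (b << 1) | (c << 2) | (d << 3)
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        # Rising clock edge: the flip-flop captures the LUT output.
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # Single output: registered or unregistered LUT value.
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock([0] * 15 + [1])   # 4-input AND
    print(and4.output(1, 1, 1, 1))      # combinational path
```

Any 4-input Boolean function is obtained by changing only the truth table, which is what makes the LUT the configurable element of the block.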

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                          Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right of and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
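The planar (domain-based) rule above can be expressed as a small Python function. This is a sketch under the assumption that the four channel segments meeting at a switch box are labelled north, east, south and west. Those labels and the function name are hypothetical.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """Return the (side, track) wires reachable from a wire entering
    a planar switch box on `side` in track number `track`."""
    if side not in SIDES:
        raise ValueError("unknown side: %r" % (side,))
    if not 0 <= track < channel_width:
        raise ValueError("track outside channel width")
    # Planar topology: the track number never changes, so the wire can
    # reach only the same track on each of the three other sides.
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_switch_targets("west", 1, channel_width=4))
```

Each entering wire therefore has exactly three programmable switches, one per remaining side, matching the description in the text.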

                                          Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or as a schematic. The HDL form can be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer simulates the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method of doing this.


Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, beyond the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.


                                          13 SIMULATION RESULTS

Fig. Gate level: VGA Controller

Fig. Gate level: Sync Unit

Fig. Technology schematic

                                          14 APPLICATIONS

                                          Motion Tracking Systems

                                          Image processing systems

                                          Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

                                          REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. "A VGA display controller" by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"


9.1 VGA PIXEL UNIT

The Pixel Unit provides access to the pixelated image stored in external picture memory, reading in data a byte at a time and formatting each pixel to be displayed. For each pixel, the 6-bit RGB value required to control the monitor's three primary color electron guns is generated, so that the pixel is displayed on the screen with the correct color.

                                            Accessing the stored image

The image to be displayed on the monitor screen is written by the host microcontroller into external memory space (RAM). This memory space can be located anywhere (e.g., using a block of RAM within the FPGA design, or using the SRAM on the NanoBoard itself).

Pixel data is stored in the memory space in bytes. The number of pixels in a byte depends on which color palette is being used for the image (selected by the CMOD input):

• in Black & White mode: one byte of data in RAM contains 8 1-bit pixels
• in 16 Colors mode: one byte of data in RAM contains 2 4-bit pixels
• in 64 Colors mode: one byte of data in RAM contains 1 6-bit pixel (DATA[7..6] are not used)

The size of memory required to store a picture is determined by the total number of viewable pixels in a line (determined by DISPSIZE_H), the total number of viewable lines in a frame (determined by DISPSIZE_V), and the number of pixels stored in each byte in memory space:

Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
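As a worked example of the formula above, the following Python sketch computes the picture memory for each color mode. The mode names and the function name are hypothetical labels chosen for illustration, and the 640 x 480 figures are used only as an example display size.

```python
# Pixels packed into one byte for each color palette (from the text).
PIXELS_PER_BYTE = {"bw": 8, "16_colors": 2, "64_colors": 1}


def picture_memory_bytes(pix_total, lin_total, mode):
    """Bytes needed to store one picture: (PixTotal x LinTotal) / pixels per byte."""
    per_byte = PIXELS_PER_BYTE[mode]
    pixels = pix_total * lin_total
    # Round up so that a partly filled final byte is still counted.
    return -(-pixels // per_byte)


if __name__ == "__main__":
    for mode in PIXELS_PER_BYTE:
        print(mode, picture_memory_bytes(640, 480, mode))
```

For a 640 x 480 picture this gives 38,400 bytes in Black & White mode, 153,600 bytes in 16 Colors mode, and 307,200 bytes in 64 Colors mode.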

The address in RAM where the next pixel is stored is determined using an internal signal provided by the Synchronization Unit, hvcnt, which reflects the current contents of the MEMADDR register. The exact addressing is described below.

9.2 Black & White mode

The picture memory address (the byte of data containing the next 8 pixels) is determined by using bits 18..3 of hvcnt, that is, by right shifting the contents by three:

ADDR_PIXEL = "00" & hvcnt[18..3]

16 Colors mode

The picture memory address (the byte of data containing the next 2 pixels) is determined by using bits 18..1 of hvcnt, that is, by right shifting the contents by one:

ADDR_PIXEL = "0" & hvcnt[18..1]

64 Colors mode

The picture memory address (the byte of data containing the next pixel) is determined by using the full value of hvcnt:

ADDR_PIXEL = hvcnt[18..0]
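The three addressing rules amount to shifting the pixel counter right by the base-2 logarithm of the number of pixels per byte. The Python sketch below models this, assuming hvcnt is a 19-bit unsigned counter. The mode names and the function name are hypothetical, and the sketch models only the numeric value of the address, not the width of the ADDR_PIXEL bus.

```python
HVCNT_BITS = 19  # hvcnt[18..0]

# Right shift applied to hvcnt for each color mode:
# 8 pixels per byte -> 3, 2 pixels per byte -> 1, 1 pixel per byte -> 0.
ADDR_SHIFT = {"bw": 3, "16_colors": 1, "64_colors": 0}


def pixel_address(hvcnt, mode):
    """Byte address in picture memory for the pixel counted by hvcnt."""
    hvcnt &= (1 << HVCNT_BITS) - 1      # keep only the 19 counter bits
    return hvcnt >> ADDR_SHIFT[mode]    # upper bits are zero filled


if __name__ == "__main__":
    # Pixel number 17 lives in byte 2 (B&W), byte 8 (16 colors), byte 17 (64 colors).
    for mode in ADDR_SHIFT:
        print(mode, pixel_address(17, mode))
```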

                                            The Pixel register (PIXREG)

                                            The Pixel register is used to receive the byte of pixel data read from the

                                            current address in memory space The register PIXREG is reset to zero when

                                            the VGA Controller receives an external reset signal (RST)


The register is updated on each rising edge of the CLK signal. Data can be read from the memory space as long as the RD signal is active (High). The RD signal is itself controlled by the external line display enable signal, enex. This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
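The enex condition can be written as a single concurrent assignment. The sketch below is illustrative only: the entity name enex_gen, the port names and the generic default values are assumptions (the defaults are placeholders, not timing figures taken from this document), and RD is simplified to follow enex directly.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity enex_gen is
  generic (
    -- Placeholder values for illustration only (assumed, not from the text).
    PIX_TOTAL    : natural := 640;
    LINE_PERIOD  : natural := 800;
    LIN_TOTAL    : natural := 480;
    FRAME_PERIOD : natural := 525
  );
  port (
    hcount : in  natural range 0 to 1023;
    vcount : in  natural range 0 to 1023;
    enex   : out std_logic;
    rd     : out std_logic
  );
end entity enex_gen;

architecture rtl of enex_gen is
  signal enex_i : std_logic;  -- internal copy so the value can be reused
begin
  -- enex is Low outside the displayed area of a line or of a frame.
  enex_i <= '0' when ((hcount > PIX_TOTAL + 1) and (hcount < LINE_PERIOD))
                  or ((vcount > LIN_TOTAL + 1) and (vcount < FRAME_PERIOD))
            else '1';

  enex <= enex_i;
  rd   <= enex_i;  -- simplification: RD = 0 whenever enex is Low
end architecture rtl;
```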

The point at which data is loaded from memory into PIXREG depends on the particular color palette that is chosen: Black & White, 16 Colors or 64 Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest three bits of the hvcnt signal, received from the Synchronization Unit, are "000".

For the currently loaded byte, the active pixel is always in the lowest bit position of the Pixel register. Each pixel in the data byte is moved into this active pixel position by shifting the contents of the register right by one bit on each rising edge of CLK.

                                            16 Colors mode

The next byte of data will be loaded into the register whenever the lowest bit of the hvcnt signal, received from the Synchronization Unit, is "0".

For the currently loaded byte, the active pixel is always in the low-order nibble of the Pixel register. Remember that in this mode each byte of data contains two pixels. The second pixel is moved into this active pixel position by shifting the contents of the register right by four bits on the rising edge of CLK.


                                            64 Colors mode

The next byte of data will be loaded into the register on the rising edge of the external system clock signal (CLK). In this mode, the read of pixel data does not depend on the status of the hvcnt signal received from the Synchronization Unit.
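The load and shift behavior of PIXREG in the three modes can be sketched as one clocked process. This is a hedged illustration rather than the Controller's implementation: the entity name pixreg_sketch, the port names, the 2-bit mode encoding and the asynchronous, active-High reset are all assumptions, and gating by the RD signal is left out for brevity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity pixreg_sketch is
  port (
    clk      : in  std_logic;
    rst      : in  std_logic;                     -- external reset (RST), assumed active High
    mode     : in  std_logic_vector(1 downto 0);  -- assumed encoding, see constants below
    hvcnt_lo : in  std_logic_vector(2 downto 0);  -- lowest three bits of hvcnt
    data     : in  std_logic_vector(7 downto 0);  -- byte read from picture memory
    pixreg   : out std_logic_vector(7 downto 0)
  );
end entity pixreg_sketch;

architecture rtl of pixreg_sketch is
  -- Assumed mode encoding (not given in the text); any other value means 64 Colors.
  constant MODE_BW : std_logic_vector(1 downto 0) := "00";
  constant MODE_16 : std_logic_vector(1 downto 0) := "01";
  signal reg : std_logic_vector(7 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      reg <= (others => '0');
    elsif rising_edge(clk) then
      if mode = MODE_BW then
        if hvcnt_lo = "000" then
          reg <= data;                      -- load a new byte of 8 pixels
        else
          reg <= '0' & reg(7 downto 1);     -- next pixel moves into bit 0
        end if;
      elsif mode = MODE_16 then
        if hvcnt_lo(0) = '0' then
          reg <= data;                      -- load a new byte of 2 pixels
        else
          reg <= "0000" & reg(7 downto 4);  -- second pixel moves into the low nibble
        end if;
      else
        reg <= data;                        -- 64 Colors: load on every rising edge
      end if;
    end if;
  end process;

  pixreg <= reg;
end architecture rtl;
```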

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving the red, green and blue color guns of the target monitor. When the chosen color palette is either Black & White or 16 Colors, these six bits are obtained by mapping the value of the active pixel to a predefined RGB code. When the chosen color palette is 64 Colors, the actual pixel value is used directly.

The register RGBREG is reset to zero (000000) when the VGA Controller receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value loaded depends on the state of the line display enable signal, en. When en is Low, blanking is required and RGBREG is loaded with the code for black (000000).
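The reset and blanking rules for RGBREG can be sketched as follows. The sketch covers only the 64 Colors case, where the pixel value is used directly; the palette mapping for the other two modes is not reproduced here. The entity name rgbreg_sketch, the port names and the asynchronous, active-High reset are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgbreg_sketch is
  port (
    clk    : in  std_logic;
    rst    : in  std_logic;                     -- external reset (RST), assumed active High
    en     : in  std_logic;                     -- line display enable
    pixel  : in  std_logic_vector(5 downto 0);  -- 6-bit pixel value (64 Colors mode)
    rgbreg : out std_logic_vector(5 downto 0)
  );
end entity rgbreg_sketch;

architecture rtl of rgbreg_sketch is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      rgbreg <= (others => '0');    -- 000000 represents black
    elsif rising_edge(clk) then
      if en = '0' then
        rgbreg <= (others => '0');  -- blanking: load the code for black
      else
        rgbreg <= pixel;            -- 64 Colors: pixel value used directly
      end if;
    end if;
  end process;
end architecture rtl;
```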

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a 4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0 and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion


For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4 = 64).

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: the ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
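As an illustration of the synchronous behavior described above, the following is a minimal sketch of a single-port 16K x 1 memory written in a style that synthesis tools commonly map onto block RAM. Whether a given tool actually infers a block RAM from it is an assumption; the entity name ram_16k_x1 and its port names are hypothetical, and the CORE Generator or direct primitive instantiation remains the route this text describes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_16k_x1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity ram_16k_x1;

architecture rtl of ram_16k_x1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        -- During a write this returns the old contents of the location.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```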


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                            Interoperability

                                            Technology independence

                                            Design reuse

                                            Several levels of abstraction

                                            Readability

                                            Standard language

                                            Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High Speed Integrated Circuit)

                                            Design specification language

                                            Design entry language

                                            Design simulation language

                                            Design documentation language

                                            An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.


The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization resulted in IEEE 1076-1987.

An improved version of the language was released in 1994 as IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types, so VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE to address this.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.
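A small self-checking simulation unit can make these types concrete. The nine std_logic values are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-', and the numeric_std package provides signed and unsigned. The entity name types_demo and the signal names are hypothetical; the point of the sketch is that the same bit pattern "1111" reads as 15 when unsigned and as -1 when signed.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity types_demo is
end entity types_demo;

architecture sim of types_demo is
  signal bus_line : std_logic            := 'Z';     -- high impedance
  signal u        : unsigned(3 downto 0) := "1111";
  signal s        : signed(3 downto 0)   := "1111";
begin
  process
  begin
    wait for 1 ns;
    assert to_integer(u) = 15 report "unsigned mismatch" severity error;
    assert to_integer(s) = -1 report "signed mismatch" severity error;
    assert bus_line = 'Z' report "expected high impedance" severity error;
    wait;  -- suspend forever once the checks are done
  end process;
end architecture sim;
```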

                                            VHDL Environment

                                            Design Units


                                            Segments of VHDL code that can be compiled separately and stored in a

                                            library

                                            Entities

                                            A black box with interface definition

Defines the inputs and outputs of a component (its pins)

                                            A way to represent modularity in VHDL

                                            Similar to symbol in schematic

                                            Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction and a type.

An entity may have NO port declaration.


Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional. Assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
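A common way to live with the "out ports cannot be read" rule, without resorting to buffer ports, is to drive an internal signal and copy it to the port. The sketch below shows this pattern; the entity name and_with_flag and its port names are hypothetical. Note that the VHDL-2008 revision relaxed this rule, so the restriction applies to the earlier language versions discussed here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_with_flag is
  port (
    a, b : in  std_logic;
    y    : out std_logic;
    y_n  : out std_logic
  );
end entity and_with_flag;

architecture rtl of and_with_flag is
  signal y_i : std_logic;  -- internal copy that may be read freely
begin
  y_i <= a and b;
  y   <= y_i;       -- the out port is only ever assigned
  y_n <= not y_i;   -- reading y itself here is not allowed before VHDL-2008
end architecture rtl;
```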

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., both a gate-level architecture and a behavioral architecture.

Configurations are always optional.
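A minimal configuration declaration for the Comparator example might look as follows. The entity and architecture are repeated so that the sketch is self-contained; the configuration name Comparator_cfg is hypothetical. With several architectures available, changing the name in the "for" clause selects a different one without touching the entity.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Binds architecture Comparator1 to entity Comparator.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end Comparator_cfg;
```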

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not always needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations


A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
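The split between package declaration and package body can be illustrated with a small example. All names here (vga_types_pkg, H_PIXELS, rgb_t, BLACK, blank) are hypothetical and chosen only for illustration. The declaration holds a constant, a subtype, a deferred constant and a function declaration; the body supplies the deferred constant's value and the function body.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_types_pkg is
  constant H_PIXELS : natural := 640;             -- constant declaration
  subtype rgb_t is std_logic_vector(5 downto 0);  -- subtype declaration
  constant BLACK : rgb_t;                         -- deferred constant
  function blank(rgb : rgb_t; en : std_logic) return rgb_t;
end package vga_types_pkg;

package body vga_types_pkg is
  constant BLACK : rgb_t := (others => '0');      -- value of the deferred constant

  -- Returns black while en is Low, otherwise passes the color through.
  function blank(rgb : rgb_t; en : std_logic) return rgb_t is
  begin
    if en = '0' then
      return BLACK;
    end if;
    return rgb;
  end function blank;
end package body vga_types_pkg;
```

A design unit would make these names visible with "use work.vga_types_pkg.all;", assuming the package was compiled into the work library.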

Libraries

A library is a collection of VHDL design units (a database):

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

                                            Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

The circuit is described in terms of its components.

The description can range from low level (e.g. a transistor-level description) to high level (e.g. a block diagram).

For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description


The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
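Since the language has no built-in register, a behavioral register description has to be written once and then reused. The following is a minimal sketch of such a description; it uses a clocked process rather than a subprogram, and the entity name reg_n, the generic WIDTH and the synchronous, active-High reset are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg_n is
  generic (WIDTH : positive := 8);
  port (
    clk : in  std_logic;
    rst : in  std_logic;  -- synchronous reset, assumed active High
    d   : in  std_logic_vector(WIDTH - 1 downto 0);
    q   : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity reg_n;

architecture behavioral of reg_n is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        q <= (others => '0');
      else
        q <= d;  -- q holds its value between clock edges
      end if;
    end if;
  end process;
end architecture behavioral;
```

A dataflow description can then instantiate reg_n wherever a register is needed and describe only the combinational logic between the instances.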

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

                                            Concurrent Vs Sequential

                                            Processes

                                            Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.


Declared with the process keyword in VHDL.

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
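The two ways of waking a process can be compared side by side. In the sketch below, both processes compute the same function: a sensitivity list behaves like a "wait on" statement placed at the end of the process. The entity name process_demo, the port names and the process labels are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_demo is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity process_demo;

architecture behavioral of process_demo is
begin
  -- Variant 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Variant 2: no sensitivity list, explicit wait statement.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process with_wait;
end architecture behavioral;
```

Both processes run concurrently with each other, while the statements inside each one execute sequentially.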


                                            11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
-- Note: STD_LOGIC_ARITH and numeric_std both declare the types signed and
-- unsigned, so using both in one unit can make those names ambiguous.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic
    );
  END component;

  component sync_unit port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

                                            component pixelgen port(

                                            clkin std_logic

                                            rstnin std_logic

                                            hsyncin std_logic

                                            vsyncin std_logic

                                            h_enin std_logic

                                            v_enin std_logic

                                            v_en_reg in std_logic

                                            data_inin std_logic_vector(0 downto 0)

                                            line_countin integer

                                            c_hsin integer

                                            red out std_logic_vector(0 downto 0)

                                            greenout std_logic_vector(0 downto 0)

                                            blueout std_logic_vector(0 downto 0)

                                            addrout std_logic_vector(17 downto 0)

                                            rdout std_logic)

                                            end component

                                            43

                                            signal clkhsync_svsync_sh_en_sv_en_sv_en_reg_srd_sstd_logic

                                            signal line_count_sc_hs_saddrinteger

                                            signal data_sstd_logic_vector(0 downto 0)

                                            signal addr_sstd_logic_vector(17 downto 0)

                                            begin

                                            --addr_s lt= std_logic_vector(to_unsigned(addr18))

                                            process(clk50mrstn)

                                            begin

                                            if(rstn = 0)then

                                            clk lt= 0

                                            elsif(clk50m = 1 and clk50mevent)then

                                            clk lt= not clk

                                            end if

                                            --end if

                                            end process

                                            hsync lt= hsync_s

                                            vsync lt= vsync_s

                                            sync sync_unit port map(

                                            clk =gt clk

                                            rstn =gt rstn

                                            hsync =gt hsync_s

                                            vsync =gt vsync_s

                                            v_en_reg =gt v_en_reg_s

                                            line_count =gt line_count_s

                                            h_en =gt h_en_s

                                            v_en =gt v_en_s

                                            c_hs =gt c_hs_s

                                            )

                                            44

                                            pixel pixelgen port map(

                                            clk =gt clk

                                            rstn =gt rstn

                                            hsync =gt hsync_s

                                            vsync =gt vsync_s

                                            h_en =gt h_en_s

                                            v_en =gt v_en_s

                                            v_en_reg =gt v_en_reg_s

                                            data_in =gt data_s

                                            line_count=gt line_count_s

                                            c_hs =gt c_hs_s

                                            red =gt red

                                            green =gt green

                                            blue =gt blue

                                            addr =gt addr_s

                                            rd =gt rd_s

                                            )

                                            mem12 memory port map(

                                            addr =gt addr_s

                                            clk =gt clk

                                            dout =gt data_s

                                            en =gt rd_s

                                            sinit=gt rstn

                                            )

                                            end Behavioral


                                            SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs       <= c_hs_s;
line_count <= line_count_s;
h_en       <= h_en_s;
v_en       <= v_en_s;

-- hsync counter: counts pixel clocks 0..793 within each line
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s = 793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter: advances once per line, on the rising edge of h_en_s
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <= 95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable ( Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >= 143) and (c_hs_s <= 778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s >= 35 and line_count_s <= 515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front porch, Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s < 2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- v_en delayed by one clock; pixelgen compares it with v_en to find the start of a frame
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
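The counter limits in the listing above fix the line and frame rates. The sketch below works through that arithmetic as elaboration-time constants with assertions. It is an illustration under stated assumptions, not part of the original design: the entity name sync_timing_check is hypothetical, and the 25 MHz figure assumes that clk50m really is a 50 MHz clock divided by two in the top-level module.

```vhdl
-- Sketch only: every limit below is read off the sync_unit listing.
entity sync_timing_check is
end sync_timing_check;

architecture sketch of sync_timing_check is
    -- Assumption: clk50m is 50 MHz, so the divided pixel clock is 25 MHz.
    constant PIXEL_CLK_HZ    : natural := 25_000_000;
    constant CLOCKS_PER_LINE : natural := 793 + 1;        -- c_hs_s counts 0..793
    constant LINES_PER_FRAME : natural := 515 + 1;        -- line_count_s counts 0..515
    constant HSYNC_CLOCKS    : natural := 95 + 1;         -- hsync low for counts 0..95
    constant H_EN_CLOCKS     : natural := 778 - 143 + 1;  -- width of the h_en_s window
    -- Integer division truncates toward zero.
    constant LINE_RATE_HZ    : natural := PIXEL_CLK_HZ / CLOCKS_PER_LINE;
    constant FRAME_RATE_HZ   : natural :=
        PIXEL_CLK_HZ / (CLOCKS_PER_LINE * LINES_PER_FRAME);
begin
    assert CLOCKS_PER_LINE = 794 report "line length"  severity failure;
    assert LINES_PER_FRAME = 516 report "frame height" severity failure;
    assert HSYNC_CLOCKS = 96     report "hsync width"  severity failure;
    assert H_EN_CLOCKS = 636     report "h_en width"   severity failure;
    assert LINE_RATE_HZ = 31486  report "line rate"    severity failure;
    assert FRAME_RATE_HZ = 61    report "frame rate"   severity failure;
end sketch;
```

For comparison, the commonly published 640x480 at 60 Hz timing uses 800 clocks per line and 525 lines per frame at 25.175 MHz, so the counts in this design are close to, but not identical to, the standard figures.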


                                            PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd     <= '0';
        red    <= "0";
        green  <= "0";
        blue   <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- rising edge of v_en: restart the read address for a new frame
        if((v_en = '1') and (v_en_reg = '0'))then
            addr_s <= (others => '0');
        end if;
        if(v_en = '1' and h_en = '1')then
            if(line_count <= 514)then
                if(c_hs <= 655)then
                    -- image area: show the memory bit on all three colour lines
                    rd     <= '1';
                    red    <= data_in;
                    green  <= data_in;
                    blue   <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    -- to the right of the image: green
                    rd    <= '0';
                    red   <= "0";
                    green <= "1";
                    blue  <= "0";
                end if;
            else
                -- below the image: white
                rd    <= '0';
                red   <= "1";
                green <= "1";
                blue  <= "1";
            end if;
            -- wrap after the last stored pixel
            if(addr_s = "111011111111111111")then
                addr_s <= (others => '0');
            end if;
        else
            -- blanking interval: drive black
            red   <= "0";
            green <= "0";
            blue  <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
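The limits in the pixel generator imply the size of the stored image. The following sketch checks that arithmetic; it is a reading of the listing above, not something the report states. It assumes that, because h_en is registered in sync_unit, the generator sees h_en = '1' one clock after the counter reaches 143, so reads happen for c_hs values 144 to 655, and it ignores one-clock skew at the edges of the frame. The entity name frame_size_check is hypothetical.

```vhdl
-- Sketch only: limits are read off the pixelgen and sync_unit listings.
entity frame_size_check is
end frame_size_check;

architecture sketch of frame_size_check is
    -- Columns read per line: h_en high (from count 144) and c_hs <= 655.
    constant COLS      : natural := 655 - 144 + 1;
    -- Lines read per frame: v_en high (from line 35) and line_count <= 514.
    constant ROWS      : natural := 514 - 35 + 1;
    constant PIXELS    : natural := COLS * ROWS;
    -- Wrap-around address compared against in pixelgen.
    constant LAST_ADDR : natural := 2#111011111111111111#;
begin
    assert COLS = 512             report "image width"  severity failure;
    assert ROWS = 480             report "image height" severity failure;
    assert PIXELS = LAST_ADDR + 1 report "memory depth" severity failure;
    assert PIXELS <= 2**18        report "address bus"  severity failure;
end sketch;
```

Under these assumptions the image is 512 x 480 one-bit pixels, or 245,760 memory locations, which fits in the 18-bit address bus (262,144 locations) and matches the wrap-around constant in the listing.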


                                            12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
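One common way to build such a logic block is around a lookup table (LUT), a term that appears in the XC2064 description later in this section. The sketch below models a generic 3-input LUT followed by a flip-flop. It is an illustration only and does not describe any particular vendor's block: the entity name lut3_ff, its ports, and the INIT generic are hypothetical. The default INIT value x"96" is the truth table of a 3-input XOR; changing INIT reconfigures the same structure into any other 3-input function.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of a logic block: 3-input LUT plus one flip-flop.
entity lut3_ff is
    generic (
        -- Bit i of INIT is the output for input value i.
        -- x"96" = 10010110, the truth table of a 3-input XOR.
        INIT : std_logic_vector(7 downto 0) := x"96"
    );
    port (
        clk      : in  std_logic;
        sel      : in  std_logic_vector(2 downto 0);
        comb_out : out std_logic;  -- combinational LUT output
        reg_out  : out std_logic   -- same value, registered
    );
end lut3_ff;

architecture sketch of lut3_ff is
    signal lut_y : std_logic;
begin
    -- The "logic" is just a table read addressed by the inputs.
    lut_y    <= INIT(to_integer(unsigned(sel)));
    comb_out <= lut_y;

    -- Memory element: a simple flip-flop on the LUT output.
    process (clk)
    begin
        if rising_edge(clk) then
            reg_out <= lut_y;
        end if;
    end process;
end sketch;
```

Setting INIT to x"80" would turn the same block into a 3-input AND, since only input value 7 (all ones) then reads a '1'.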

History

The FPGA industry sprouted from programmable read only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field programmable gate array in 1985, the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field
to fix bugs, and lower non-recurring engineering costs. Vendors can also take a
middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the design
has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively.
ASIC complexity has increased development time and costs.
R&D resources and headcount are decreasing.
Revenue losses for slow time-to-market are increasing.
Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company attributes the growing number of FPGA design starts to them
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. The
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

                                            FPGAs originally began as competitors to CPLDs and competed in a similar

                                            space, that of glue logic for PCBs. As their size, capabilities, and speed

                                            increased, they began to take over larger and larger functions, to the point where

                                            some are now marketed as full systems on chips (SoC). Particularly with the

                                            introduction of dedicated multipliers into FPGA architectures in the late 1990s,

                                            applications which had traditionally been the sole reserve of DSPs began to

                                            incorporate FPGAs instead.

                                            FPGAs especially find applications in any area or algorithm that can make use

                                            of the massive parallelism offered by their architecture. One such area is code

                                            breaking, in particular brute-force attacks on cryptographic algorithms.

                                            FPGAs are increasingly used in conventional high performance computing

                                            applications, where computational kernels such as FFT or convolution are

                                            performed on the FPGA instead of a microprocessor.


                                            The inherent parallelism of the logic resources on an FPGA allows for

                                            considerable computational throughput even at low MHz clock rates. The

                                            flexibility of the FPGA allows for even higher performance by trading off

                                            precision and range in the number format for an increased number of parallel

                                            arithmetic units. This has driven a new type of processing called

                                            reconfigurable computing, where time-intensive tasks are offloaded from

                                            software to FPGAs.

                                            The adoption of FPGAs in high performance computing is currently limited by

                                            the complexity of FPGA design compared to conventional software and the

                                            extremely long turn-around times of current design tools, where a 4-8 hour wait

                                            is necessary after even minor changes to the source code.

                                            Traditionally, FPGAs have been reserved for specific vertical applications

                                            where the volume of production is small. For these low-volume applications,

                                            the premium that companies pay in hardware costs per unit for a

                                            programmable chip is more affordable than the development resources spent

                                            on creating an ASIC. Today, new cost and

                                            performance dynamics have broadened the range of viable applications.

                                            Architecture

                                            The most common FPGA architecture consists of an array of configurable

                                            logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing

                                            channels have the same width (number of wires). Multiple I/O pads may fit

                                            into the height of one row or the width of one column in the array.

                                            An application circuit must be mapped into an FPGA with adequate resources.

                                            While the number of CLBs and I/Os required is easily determined from the

                                            design, the number of routing tracks needed may vary considerably, even

                                            among designs with the same amount of logic. (For example, a crossbar switch

                                            requires much more routing than a systolic array with the same gate count.)

                                            Since unused routing tracks increase the cost (and decrease the performance)

                                            of the part without providing any benefit, FPGA manufacturers try to provide

                                            just enough tracks so that most designs that will fit in terms of LUTs and I/Os

                                            can be routed. This is determined by estimates such as those derived from

                                            Rent's rule or by experiments with existing designs.

                                            The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic

                                            blocks and routing channels. Two I/O pads fit into the height of one row or the

                                            width of one column, as shown below. All the routing channels have the same

                                            width (number of wires).

                                            12.1 FPGA structure

                                            A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

                                            flip-flop, as shown below. In recent years, manufacturers have started moving

                                            to 6-input LUTs in their high performance parts, claiming increased

                                            performance.

                                            Typical logic block


                                            There is only one output, which can be either the registered or the unregistered

                                            LUT output. The logic block has four inputs for the LUT and a clock input.

                                            Since clock signals (and often other high-fanout signals) are normally routed

                                            via special-purpose dedicated routing networks in commercial FPGAs, they

                                            and other signals are separately managed.
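
The logic block described above can be sketched in software as a 16-entry truth table plus one bit of state. This is only an illustrative model, not vendor code: the class name `LogicBlock`, its methods, and the input ordering (input `a` as the least significant index bit) are assumptions made for the example.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 truth-table entries")
        self.truth_table = [int(bool(bit)) for bit in truth_table]
        self.registered = registered  # True selects the flip-flop output
        self.ff = 0                   # flip-flop state, cleared at start

    def lut(self, a, b, c, d):
        # The four inputs form an index into the 16-entry table
        # (assumed ordering: a is the least significant bit).
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def tick(self, a, b, c, d):
        # Rising clock edge: the flip-flop captures the current LUT output.
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # Single output: registered (flip-flop) or unregistered (LUT) value.
        return self.ff if self.registered else self.lut(a, b, c, d)


# Example configuration: 4-input XOR (parity of the index bits).
xor4 = [bin(i).count("1") % 2 for i in range(16)]
combinational = LogicBlock(xor4)
registered = LogicBlock(xor4, registered=True)
```

With `registered=True` the output only changes after `tick` is called, which mirrors the difference between the registered and unregistered LUT output.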

                                            For this example architecture, the locations of the FPGA logic block pins are

                                            shown below.

                                            Logic Block Pin Locations

                                            Each input is accessible from one side of the logic block, while the output pin

                                            can connect to routing wires in both the channel to the right and the channel

                                            below the logic block.

                                            Each logic block output pin can connect to any of the wiring segments in the

                                            channels adjacent to it.

                                            Similarly, an I/O pad can connect to any one of the wiring segments in the

                                            channel adjacent to it. For example, an I/O pad at the top of the chip can

                                            connect to any of the W wires (where W is the channel width) in the horizontal

                                            channel immediately below it.

                                            Generally, the FPGA routing is unsegmented. That is, each wiring segment

                                            spans only one logic block before it terminates in a switch box. By turning on

                                            some of the programmable switches within a switch box, longer paths can be

                                            constructed. For higher speed interconnect, some FPGA architectures use

                                            longer routing lines that span multiple logic blocks.

                                            Whenever a vertical and a horizontal channel intersect, there is a switch box. In

                                            this architecture, when a wire enters a switch box, there are three

                                            programmable switches that allow it to connect to three other wires in adjacent

                                            channel segments. The pattern, or topology, of switches used in this

                                            architecture is the planar or domain-based switch box topology. In this switch

                                            box topology, a wire in track number one connects only to wires in track

                                            number one in adjacent channel segments, wires in track number 2 connect

                                            only to other wires in track number 2, and so on. The figure below illustrates

                                            the connections in a switch box.

                                            Switch box topology
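
The planar (domain-based) topology can be enumerated with a few lines of code. The sketch below is illustrative only: the side names, the 0-based track numbering, and the helper names `planar_switch_box` and `neighbours` are assumptions for the example, not part of any FPGA tool.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_switch_box(channel_width):
    """Return the programmable switches of one switch box.

    Each switch is a pair of wire endpoints, and each endpoint is a
    (side, track) tuple. In the planar topology a switch only joins
    wires that share the same track number.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(((side_a, track), (side_b, track)))
    return switches


def neighbours(switches, wire):
    """Return the wires reachable from `wire` through a single switch."""
    reachable = set()
    for end_a, end_b in switches:
        if end_a == wire:
            reachable.add(end_b)
        elif end_b == wire:
            reachable.add(end_a)
    return reachable


box = planar_switch_box(channel_width=4)
```

For a channel width of W this yields 6 switches per track (every pair among the four sides), so a wire entering the box can reach exactly three other wires, all on its own track.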

                                            Modern FPGA families expand upon the above capabilities to include higher

                                            level functionality fixed into the silicon. Having these common functions

                                            embedded into the silicon reduces the area required and gives those functions

                                            increased speed compared to building them from primitives. Examples of these

                                            include multipliers, generic DSP blocks, embedded processors, high speed I/O

                                            logic, and embedded memories.


                                            FPGAs are also widely used for systems validation, including pre-silicon

                                            validation, post-silicon validation, and firmware development. This allows chip

                                            companies to validate their design before the chip is produced in the factory,

                                            reducing the time to market.

                                            12.4 FPGA Design and Programming

                                            To define the behavior of the FPGA, the user provides a hardware description

                                            language (HDL) or a schematic design. The HDL form might be easier to work

                                            with when handling large structures because it's possible to just specify them

                                            numerically rather than having to draw every piece by hand. On the other hand,

                                            schematic entry can allow for easier visualisation of a design.

                                            Then, using an electronic design automation tool, a technology-mapped netlist

                                            is generated. The netlist can then be fitted to the actual FPGA architecture

                                            using a process called place-and-route, usually performed by the FPGA

                                            company's proprietary place-and-route software. The user will validate the

                                            map, place, and route results via timing analysis, simulation, and other

                                            verification methodologies. Once the design and validation process is

                                            complete, the binary file generated (also using the FPGA company's

                                            proprietary software) is used to (re)configure the FPGA.

                                            Going from schematic/HDL source files to actual configuration: the source

                                            files are fed to a software suite from the FPGA/CPLD vendor that, through

                                            different steps, will produce a file. This file is then transferred to the

                                            FPGA/CPLD via a serial interface (JTAG) or to an external memory device

                                            like an EEPROM.

                                            The most common HDLs are VHDL and Verilog, although in an attempt to

                                            reduce the complexity of designing in HDLs, which have been compared to the

                                            equivalent of assembly languages, there are moves to raise the abstraction level

                                            through the introduction of alternative languages.


                                            To simplify the design of complex systems in FPGAs, there exist libraries of

                                            predefined complex functions and circuits that have been tested and optimized

                                            to speed up the design process. These predefined circuits are commonly called

                                            IP cores and are available from FPGA vendors and third-party IP suppliers

                                            (rarely free, and typically released under proprietary licenses). Other

                                            predefined circuits are available from developer communities such as

                                            OpenCores (typically free, and released under the GPL, BSD, or similar license) and

                                            other sources.

                                            In a typical design flow, an FPGA application developer will simulate the

                                            design at multiple stages throughout the design process. Initially, the RTL

                                            description in VHDL or Verilog is simulated by creating test benches to

                                            simulate the system and observe results. Then, after the synthesis engine has

                                            mapped the design to a netlist, the netlist is translated to a gate level

                                            description, where simulation is repeated to confirm the synthesis proceeded

                                            without errors. Finally, the design is laid out in the FPGA, at which point

                                            propagation delays can be added and the simulation run again with these

                                            values back-annotated onto the netlist.

                                            Basic Process Technology Types

                                            SRAM - based on static memory technology. In-system programmable and

                                            re-programmable. Requires external boot devices. CMOS.

                                            Antifuse - One-time programmable. CMOS.

                                            EPROM - Erasable Programmable Read-Only Memory technology. Usually

                                            one-time programmable in production because of plastic packaging.

                                            Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                            EEPROM - Electrically Erasable Programmable Read-Only Memory

                                            technology. Can be erased, even in plastic packages. Some, but not all,

                                            EEPROM devices can be in-system programmed. CMOS.

                                            Flash - Flash-erase EPROM technology. Can be erased, even in plastic

                                            packages. Some, but not all, flash devices can be in-system programmed.

                                            Usually, a flash cell is smaller than an equivalent EEPROM cell and is

                                            therefore less expensive to manufacture. CMOS.

                                            Fuse - One-time programmable. Bipolar.

                                            Major Manufacturers

                                            Xilinx and Altera are the current FPGA market leaders and long-time industry

                                            rivals. Together, they control over 80 percent of the market, with Xilinx alone

                                            representing over 50 percent.

                                            Xilinx also provides free Windows and Linux design software, while Altera

                                            provides free Windows tools; the Solaris and Linux tools are only available via

                                            a rental scheme.

                                            Other competitors include Lattice Semiconductor (flash, SRAM), Actel

                                            (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),

                                            Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld

                                            focused CSSP, no general purpose FPGAs).

                                            12.5 FPGA prototype

                                            FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC

                                            prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for

                                            hardware verification and early software development.

                                            Verification of hardware designs alongside early software and

                                            firmware co-design has become mainstream. Prototyping SoC and ASIC

                                            designs on FPGAs has become a good method to do this.


                                            Reasons why prototyping is important:

                                            1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is

                                            functionally correct. This is compared to designers relying only on software

                                            simulations to verify that their hardware design is sound. Simulation speed and

                                            modeling accuracy limitations hinder this development.

                                            2. Due to time constraints, many projects cannot wait until the silicon is back from

                                            the foundry to start on software tests. FPGA prototyping allows for much more

                                            time for software development and testing at the software-hardware

                                            integration stage. This allows many unforeseen software bugs that appear due

                                            to today's array of operating systems, applications, and hardware to be found early.

                                            3. Prototyping also allows the developer to ensure that all IP technologies on the

                                            system work well together, off the simulation stage and in actual form.

                                            4. Prototyping has the added advantage of serving as a demo platform for SoC clients,

                                            bringing in interest early. This speeds up the overall development cycle and

                                            allows for more enhancement or improvement to the chip features than would

                                            otherwise have been possible.


                                            13 SIMULATION RESULTS

                                            Fig: VGA controller (gate level)

                                            Fig: Sync unit (gate level)

                                            Fig: Technology schematic

                                            14 APPLICATIONS

                                            Motion Tracking Systems

                                            Image processing systems

                                            Displaying Systems

                                            14.1 CONCLUSION

                                            Hence, when implemented on an FPGA, the Controller takes a processor-generated picture

                                            (pixelated) from memory space and provides digital RGB values for

                                            each pixel, as well as horizontal and vertical synchronization signals, in order

                                            to correctly display the picture on a connected monitor.


                                            REFERENCES

                                            1. http://en.wikipedia.org/wiki/Video_Graphics_Array

                                            2. "A VGA display controller" by Eduardo Sanchez

                                            3. Hinner, "VGA Interface and video signal documents"


                                              The size of memory required to store a picture is determined by the total

                                              number of viewable pixels in a line (determined by DISPSIZE_H), the total

                                              number of viewable lines in a frame (determined by DISPSIZE_V), and the

                                              number of pixels stored in each byte in memory space:

                                              Memory required for picture = (PixTotal x LinTotal) / number of pixels per byte
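
As a quick worked example of the formula, the sketch below evaluates it for the three palette modes. The 640x480 display size is an assumed example value, and the function and dictionary names are hypothetical; the pixels-per-byte figures (8, 2, and 1) are the ones the addressing description gives for each mode.

```python
# Pixels packed into each byte of picture memory, per palette mode.
PIXELS_PER_BYTE = {"bw": 8, "16": 2, "64": 1}


def picture_memory_bytes(pix_total, lin_total, pixels_per_byte):
    """Bytes needed to store one picture, rounded up to a whole byte."""
    total_pixels = pix_total * lin_total
    return -(-total_pixels // pixels_per_byte)  # ceiling division


# Assumed example display size: 640 viewable pixels by 480 viewable lines.
sizes = {
    mode: picture_memory_bytes(640, 480, per_byte)
    for mode, per_byte in PIXELS_PER_BYTE.items()
}
```

Under these assumptions Black & White mode needs 38400 bytes, 16 Colors mode 153600 bytes, and 64 Colors mode 307200 bytes, all of which fit in the range of a 19-bit address.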

                                              The address in RAM where the next pixel is stored is determined using

                                              an internal signal provided by the Synchronization Unit - hvcnt - which

                                              reflects the current contents of the MEMADDR register. The exact addressing

                                              is described below.

                                              9.2 Black & White mode

                                              The picture memory address - the byte of data containing the next 8 pixels

                                              - is determined by using bits 18..3 of hvcnt and right shifting the contents by

                                              three:

                                              ADDR_PIXEL = 00 & hvcnt[18..3]

                                              16 Colors mode

                                              The picture memory address - the byte of data containing the next 2 pixels

                                              - is determined by using bits 18..1 of hvcnt and right shifting the contents by

                                              one:

                                              ADDR_PIXEL = 0 & hvcnt[18..1]

                                              64 Colors mode

                                              The picture memory address - the byte of data containing the next pixel - is

                                              determined by using the full value of hvcnt:

                                              ADDR_PIXEL = hvcnt[18..0]
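
The three addressing rules amount to a right shift of the pixel counter by 3, 1, or 0 bits. The following sketch shows this under the assumption that hvcnt is a 19-bit value (bits 18 down to 0); the function name `addr_pixel` and the mode keys are hypothetical.

```python
HVCNT_BITS = 19  # hvcnt is indexed from bit 18 down to bit 0

# Right shift applied to hvcnt in each palette mode.
ADDRESS_SHIFT = {"bw": 3, "16": 1, "64": 0}


def addr_pixel(hvcnt, mode):
    """Byte address holding the pixel that hvcnt currently points at."""
    if not 0 <= hvcnt < (1 << HVCNT_BITS):
        raise ValueError("hvcnt must fit in 19 bits")
    # Dropping the low bits and zero-filling the top is a plain right shift.
    return hvcnt >> ADDRESS_SHIFT[mode]
```

For instance, pixel counts 8 through 15 all map to byte address 1 in Black & White mode, because those eight pixels share one byte.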

                                              The Pixel register (PIXREG)

                                              The Pixel register is used to receive the byte of pixel data read from the

                                              current address in memory space. The register PIXREG is reset to zero when

                                              the VGA Controller receives an external reset signal (RST).

                                              The register is updated on each rising edge of the CLK signal. Data can be

                                              read from the memory space as long as the RD signal is active (High). The RD

                                              signal is itself controlled by the external line display enable signal, enex. This

                                              internally generated signal is defined as follows:

                                              If ((HCOUNT > (PixTotal+1)) and (HCOUNT < Line Period)) or ((VCOUNT >

                                              (LinTotal+1)) and (VCOUNT < Frame Period)) then enex = 0

                                              Else

                                              enex = 1

                                              When enex is Low, read access from memory is disabled (RD = 0).
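
The enex condition can be checked with a small software model. This is a sketch only: the function name and parameter names are hypothetical, and the example values (640 and 480 viewable pixels and lines, a line period of 800 and a frame period of 525, as in standard 640x480 at 60 Hz timing) are assumptions chosen for illustration.

```python
def enex(hcount, vcount, pix_total, lin_total, line_period, frame_period):
    """Return 1 while memory reads are enabled, 0 during blanking."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


# Assumed example timing: 640x480 viewable, 800 counts per line, 525 lines.
TIMING = dict(pix_total=640, lin_total=480, line_period=800, frame_period=525)

in_picture = enex(10, 10, **TIMING)       # inside the viewable area
in_h_blank = enex(700, 10, **TIMING)      # past the end of the viewable line
in_v_blank = enex(10, 500, **TIMING)      # past the last viewable line
```

The RD signal then simply follows enex, so no memory reads are issued while the counters are in the blanking intervals.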

                                              The point at which data is loaded from memory into PIXREG depends on the

                                              particular color palette that is chosen - Black & White, 16 Colors, or 64

                                              Colors.

                                              Black & White mode

                                              The next byte of data will be loaded into the register whenever the

                                              lowest three bits of the hvcnt signal - received from the Synchronization Unit -

                                              are "000".

                                              For the currently loaded byte, the active pixel is always in the lowest bit

                                              position of the Pixel register. Each pixel in the data byte is moved into this

                                              active pixel position by shifting the contents of the register right by one bit on

                                              each rising edge of CLK.

                                              16 Colors mode

                                              The next byte of data will be loaded into the register whenever the

                                              lowest bit of the hvcnt signal - received from the Synchronization Unit - is a

                                              0.

                                              For the currently loaded byte, the active pixel is always in the low order

                                              nibble of the Pixel register. Remember that in this mode, each byte of data

                                              contains two pixels. The second pixel is moved into this active pixel position

                                              by shifting the contents of the register right by four bits on the rising edge of

                                              CLK.


                                              64 Colors mode

                                              The next byte of data will be loaded into the register on the rising edge

                                              of the external system clock signal (CLK). In this mode, the read of pixel data

                                              does not depend on the status of the hvcnt signal received from the

                                              Synchronization Unit.
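
The load and shift behaviour of PIXREG in the three modes can be simulated as follows. This is a simplified sketch, not the controller's HDL: the class and method names are hypothetical, the RD gating is not modelled, and treating the low six bits as the active pixel in 64 Colors mode is an assumption.

```python
class PixelRegister:
    """Toy model of PIXREG for the modes "bw", "16", and "64"."""

    def __init__(self, mode):
        if mode not in ("bw", "16", "64"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode
        self.value = 0  # state after an external reset (RST)

    def rising_edge(self, hvcnt, memory_byte):
        """Update the register on one rising edge of CLK."""
        if self.mode == "bw":
            if hvcnt & 0b111 == 0:      # lowest three bits are "000": load
                self.value = memory_byte & 0xFF
            else:                       # otherwise move the next pixel down
                self.value >>= 1
        elif self.mode == "16":
            if hvcnt & 0b1 == 0:        # lowest bit is 0: load
                self.value = memory_byte & 0xFF
            else:                       # bring the second pixel into place
                self.value >>= 4
        else:                           # 64 Colors: load on every edge
            self.value = memory_byte & 0xFF

    def active_pixel(self):
        if self.mode == "bw":
            return self.value & 0b1     # lowest bit
        if self.mode == "16":
            return self.value & 0xF     # low order nibble
        return self.value & 0x3F        # assumed: low six bits


def pixels_for(mode, memory_byte, clocks):
    """Clock one byte through the register and collect the active pixels."""
    reg = PixelRegister(mode)
    out = []
    for hvcnt in range(clocks):
        reg.rising_edge(hvcnt, memory_byte)
        out.append(reg.active_pixel())
    return out
```

Feeding the byte 0b10110010 through Black & White mode for eight clocks yields its bits in order from least to most significant, and the byte 0xA5 in 16 Colors mode yields the nibbles 0x5 and then 0xA.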

                                              9.3 The RGB register (RGBREG)

                                              The RGB register is used to store the six bits that are required for

                                              driving the red, green, and blue color guns of the target monitor. When the

                                              chosen color palette is either Black & White or 16 Colors, these six bits are

                                              obtained by mapping the value of the active pixel to a predefined RGB code.

                                              When the chosen color palette is 64 Colors, the actual pixel value is used

                                              directly.

                                              The register RGBREG is reset to zero (000000) when the VGA

                                              Controller receives an external reset signal (RST). This RGB code represents

                                              black.

                                              The register is updated on each rising edge of the CLK signal, and the

                                              value loaded is dependent on the state of the line display enable signal, en.

                                              When en is Low, blanking is required and RGBREG is loaded with the code

                                              for black (000000).

                                              Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)

                                              and a 4-bit pixel (16 Colors mode) into the required RGB color code.


                                              The RGB color code stored in the RGB register is output from the VGA

                                              Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,

                                              and B1).

                                              The monitor itself expects analog signals as inputs to its electron gun control

                                              circuits. This is achieved by using 2-bit digital to analog converter circuitry

                                              located on the Nano Board itself, as shown in Figure 4.

                                              Figure 9.1 Digital to Analog RGB conversion


For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color combinations (4 x 4 x 4).
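The arithmetic above can be captured in a small simulation-only VHDL package. It is a sketch under a stated assumption: the four levels are taken to be evenly spaced between 0 V and 0.7 V, which the text does not confirm. The package and function names are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical simulation model of one 2-bit DAC channel.
package dac_model_pkg is
  constant V_MAX      : real    := 0.7;      -- maximum brightness, in volts
  constant NUM_COLORS : natural := 4 ** 3;   -- 4 levels per channel, 3 channels

  -- Nominal output voltage for a 2-bit code (assumes evenly spaced levels).
  function dac_level (code : std_logic_vector(1 downto 0)) return real;
end package dac_model_pkg;

package body dac_model_pkg is
  function dac_level (code : std_logic_vector(1 downto 0)) return real is
  begin
    -- Codes 0..3 are scaled linearly onto 0 V .. V_MAX.
    return real(to_integer(unsigned(code))) * V_MAX / 3.0;
  end function dac_level;
end package body dac_model_pkg;
```

Under that assumption, code "00" gives 0 V and code "11" gives approximately 0.7 V, with two intermediate levels in between.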

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.


Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
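As an illustration of the synchronous, single-port usage described above, the following is a minimal sketch of a 16K x 1 memory written in generic VHDL. The entity name sp_ram_16kx1 and its port names are assumptions for the example. Whether a given synthesis tool maps this description onto one block RAM depends on the tool and its settings, so this is not a substitute for CORE Generator output or primitive instantiation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous RAM, 16K deep by 1 bit wide.
entity sp_ram_16kx1 is
  port (
    clk  : in  std_logic;                      -- port clock
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity sp_ram_16kx1;

architecture rtl of sp_ram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;   -- synchronous write
        end if;
        -- Synchronous read: dout is registered and, on a write cycle,
        -- returns the old contents of the addressed location.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write happen only on a rising clock edge with en High, mirroring the statement that each port has its own clock, clock enable and write enable.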


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                              Interoperability

                                              Technology independence

                                              Design reuse

                                              Several levels of abstraction

                                              Readability

                                              Standard language

                                              Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                              Design specification language

                                              Design entry language

                                              Design simulation language

                                              Design documentation language

                                              An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers

Funded by the US Department of Defense


The first publicly available version was released in 1985

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL

In 1987: standardization => IEEE 1076-1987

An improved version of the language was released in 1994 => IEEE standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard

The IEEE 1164 standard was developed by the IEEE

IEEE 1164 contains definitions for a nine-valued data type, std_logic

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware

Defines e.g. two numeric types: signed and unsigned
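The role of these packages can be illustrated with a short sketch. The entity types_demo and its ports are hypothetical. It uses the std_logic type from IEEE 1164 (including the high-impedance value 'Z') and the signed and unsigned types from the IEEE 1076.3 package numeric_std.

```vhdl
library ieee;
use ieee.std_logic_1164.all;   -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;      -- signed, unsigned (IEEE 1076.3)

-- Hypothetical entity used only to show the standard types in use.
entity types_demo is
  port (
    oe    : in  std_logic;                     -- output enable
    a, b  : in  std_logic_vector(7 downto 0);
    sum_u : out std_logic_vector(7 downto 0);  -- a + b, both read as unsigned
    lt_s  : out std_logic;                     -- '1' when a < b, both read as signed
    bus_o : out std_logic                      -- tri-stated when oe is Low
  );
end entity types_demo;

architecture rtl of types_demo is
begin
  -- The same bit pattern is interpreted according to the chosen numeric type.
  sum_u <= std_logic_vector(unsigned(a) + unsigned(b));
  lt_s  <= '1' when signed(a) < signed(b) else '0';

  -- 'Z' is one of the nine std_logic values; type bit cannot express it.
  bus_o <= a(0) when oe = '1' else 'Z';
end architecture rtl;
```

The explicit conversions make the intended interpretation of each vector visible at the point of use, which is the main reason numeric_std is preferred over vendor-specific arithmetic packages.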

                                              VHDL Environment

                                              Design Units


Segments of VHDL code that can be compiled separately and stored in a library

                                              Entities

                                              A black box with interface definition

Defines the inputs/outputs of a component (define pins)

                                              A way to represent modularity in VHDL

                                              Similar to symbol in schematic

                                              Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration


Port directions

In: A value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed

Buffer: An out port with read capability. May have at most one assignment (not recommended)
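A common way to live within these rules, without resorting to a buffer port, is to keep the value in an internal signal and drive the out port from it. The counter below is a hypothetical example of that pattern. The entity name counter4 and its ports are assumptions.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter illustrating in and out port modes.
entity counter4 is
  port (
    clk : in  std_logic;                     -- in: may only be read
    rst : in  std_logic;                     -- in: synchronous reset, active High
    q   : out std_logic_vector(3 downto 0)   -- out: may only be assigned
  );
end entity counter4;

architecture rtl of counter4 is
  -- The count is read back every cycle, so it lives in an internal signal
  -- rather than in the out port itself.
  signal count : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count <= (others => '0');
      else
        count <= count + 1;
      end if;
    end if;
  end process;

  q <= std_logic_vector(count);  -- single assignment to the out port
end architecture rtl;
```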

Architectures

Every entity has at least one architecture

One entity can have several architectures

Architectures can describe design using
– Behavior
– Structure
– Dataflow

Architectures can describe design on many levels
– Gate level
– RTL (Register Transfer Level)
– Behavioral level

Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Links entity declaration and architecture body together

Concept of default configuration is a bit messy in VHDL '87
– Last architecture analyzed links to entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have e.g. a gate level architecture and a behavioral architecture

Are always optional
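The following sketch shows one entity with two architectures and a configuration declaration that selects between them. The entity and the first architecture repeat the Comparator example used earlier in this section. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow-style architecture (as in the example above).
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Behavioral-style architecture for the same entity (hypothetical).
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: binds the entity to one chosen architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Elaborating Comparator_cfg instead of the bare entity selects Comparator2 explicitly, so the choice no longer depends on which architecture happened to be analyzed last.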

Packages

Packages contain information common to many design units

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of a declaration part and an optional body part

Package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations


Package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
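A small package with a declaration part and a body is sketched below. The package name vga_pkg and everything in it are hypothetical and are not taken from the project source. The timing constants are the commonly used figures for 640x480 at 60 Hz.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package: declaration part.
package vga_pkg is
  -- Constant declarations (640x480 at 60 Hz, counted in pixels and lines).
  constant H_VISIBLE : natural := 640;
  constant H_TOTAL   : natural := 800;
  constant V_VISIBLE : natural := 480;
  constant V_TOTAL   : natural := 525;

  -- Subtype declaration.
  subtype pixel_t is std_logic_vector(0 downto 0);

  -- Deferred constant: the value is given in the package body.
  constant BLACK : pixel_t;

  -- Function declaration.
  function visible (h, v : natural) return boolean;
end package vga_pkg;

-- Hypothetical package: body part.
package body vga_pkg is
  constant BLACK : pixel_t := "0";

  -- True while the horizontal and vertical counters are in the visible area.
  function visible (h, v : natural) return boolean is
  begin
    return (h < H_VISIBLE) and (v < V_VISIBLE);
  end function visible;
end package body vga_pkg;
```

A design unit would make these names visible with a clause such as "use work.vga_pkg.all;", assuming the package has been compiled into the working library.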

Libraries

Collection of VHDL design units (database)

1. Packages
– package declaration
– package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system

Can also be any other kind of database

                                              Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the HW

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction
– Structural
– Dataflow
– Behavioral

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

For large circuits, a low-level description quickly becomes impractical
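A structural description can be sketched as follows: a 2-bit equality comparator built purely by wiring together two kinds of gate. All entity, component and signal names (xnor2, and2, eq2 and so on) are hypothetical and chosen for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity xnor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xnor2;

architecture rtl of xnor2 is
begin
  y <= a xnor b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Structural top level: only component instances and interconnect.
entity eq2 is
  port (a, b : in  std_logic_vector(1 downto 0);
        eq   : out std_logic);
end entity eq2;

architecture structural of eq2 is
  component xnor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal bit0_eq, bit1_eq : std_logic;
begin
  u0 : xnor2 port map (a => a(0), b => b(0), y => bit0_eq);
  u1 : xnor2 port map (a => a(1), b => b(1), y => bit1_eq);
  u2 : and2  port map (a => bit0_eq, b => bit1_eq, y => eq);
end architecture structural;
```

Even for two bits the structural form needs three instances and two internal signals, which gives a feel for why low-level structural descriptions become impractical for large circuits.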

10.3 Dataflow VHDL Description


Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in the system

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
– Either a lower level description
– or a behavioral description of sequential elements is needed

The lower level register descriptions must be created or obtained

If there are no 3rd party models for registers => you must write the behavioral description of registers

The behavioral description can be provided in the form of subprograms (functions or procedures)
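The idea of supplying the register as a subprogram can be sketched as below. The procedure dff, the entity xor_pipe and all signal names are hypothetical. The example relies on the language rule that a concurrent procedure call is re-executed whenever one of its input signal parameters changes. This works in simulation, but synthesis support for the style varies between tools, so treat it as an illustration rather than a recommended coding style.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: combinational logic followed by one register.
entity xor_pipe is
  port (clk, a, b : in  std_logic;
        q         : out std_logic);
end entity xor_pipe;

architecture dataflow of xor_pipe is
  -- Behavioral description of a D-type register, written as a procedure.
  procedure dff (signal c  : in  std_logic;
                 signal d  : in  std_logic;
                 signal qo : out std_logic) is
  begin
    if rising_edge(c) then
      qo <= d;
    end if;
  end procedure dff;

  signal x : std_logic;
begin
  x <= a xor b;      -- combinational logic, described at a high level
  dff(clk, x, q);    -- concurrent procedure call: the register placement
end architecture dataflow;
```

The architecture body reads as a description of data moving from a and b, through the XOR, into a register, which is the essence of the dataflow style.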

10.4 Behavioral VHDL Description

                                              Circuit is described in terms of its operation over time


Representation might include e.g. state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, the order of sequential operations is defined

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis
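The use of explicit delays can be illustrated with a small simulation-only sketch. The entity delay_demo and its signals are hypothetical. A synthesis tool would ignore the "after" clause, and the wait statements are meaningful only in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test entity: no ports, intended for simulation only.
entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Time expressed precisely: a follows b with a 10 ns delay.
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';           -- a becomes '1' 10 ns later
    wait for 25 ns;
    b <= '0';           -- a returns to '0' 10 ns after this change
    wait for 25 ns;
    assert a = '0'
      report "a did not follow b" severity error;
    wait;               -- suspend forever: end of stimulus
  end process stimulus;
end architecture sim;
```

Inside the process, the statements execute in the order written and time advances only at the wait statements, which is what defines the order of the sequential operations.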

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL

VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

Process statement is a concurrent statement

Statements inside process statements are sequential statements

Process must contain either a sensitivity list or wait statement(s), but NOT both

Sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
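The two permitted forms can be compared side by side. In the sketch below (entity and label names are hypothetical) both processes compute the same AND function, one with a sensitivity list and one with an explicit wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity showing the two equivalent process forms.
entity and_two_ways is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_two_ways;

architecture behavioral of and_two_ways is
begin
  -- Form 1: sensitivity list. The process wakes up when a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: wait statement. A process with sensitivity list (a, b) behaves
  -- as if it ended with "wait on a, b;", which is written out here.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

The two process statements are concurrent with respect to each other, while the statements inside each one execute sequentially.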


                                              11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Note: STD_LOGIC_ARITH and numeric_std both declare signed/unsigned;
-- this top level uses neither type directly.
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: one bit per pixel, 18-bit address
  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  -- Generates the sync pulses, display enables and position counters
  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  -- Produces the memory address and the RGB outputs for each pixel
  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr,18));

  -- Divide clk50m by two to produce the clock clk used by the sub-blocks
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                                              SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  -- horizontal sync pulse (row)
  vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clocks within a line, 0 to 793
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used downstream to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
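
The counter limits used in the synchronization block can be gathered as named constants, which makes the resulting timing easier to check. The package below is only a sketch and is not part of the original design: the package name and constant names are hypothetical, and the values are read directly off the comparisons in the listing above. Assuming clk runs at 25 MHz (clk50m divided by two, taking the name to mean a 50 MHz source), 794 clocks per line gives a line rate of roughly 31.5 kHz, and 794 x 516 = 409,704 clocks per frame gives a refresh rate of roughly 61 Hz. For comparison, the common 640 x 480 at 60 Hz timing uses 800 clocks per line and 525 lines per frame at 25.175 MHz.

```vhdl
-- Hypothetical package: names are illustrative, values come from the listing.
package vga_timing_sketch is
  -- Horizontal timing, in clk cycles
  constant H_TOTAL    : integer := 794;  -- c_hs_s counts 0 to 793
  constant H_SYNC_END : integer := 95;   -- hsync is low while c_hs_s <= 95
  constant H_EN_START : integer := 143;  -- h_en_s is set for c_hs_s in 143 ...
  constant H_EN_END   : integer := 778;  -- ... to 778 inclusive

  -- Vertical timing, in lines
  constant V_TOTAL      : integer := 516;  -- line_count_s counts 0 to 515
  constant V_SYNC_LINES : integer := 2;    -- vsync is low while line_count_s < 2
  constant V_EN_START   : integer := 35;   -- v_en_s is set for lines 35 ...
  constant V_EN_END     : integer := 515;  -- ... to 515 inclusive

  -- Derived quantities
  constant H_SYNC_WIDTH     : integer := H_SYNC_END + 1;                 -- 96 clocks
  constant H_EN_WIDTH       : integer := H_EN_END - H_EN_START + 1;      -- 636 clocks
  constant V_EN_LINES       : integer := V_EN_END - V_EN_START + 1;      -- 481 lines
  constant CLOCKS_PER_FRAME : integer := H_TOTAL * V_TOTAL;              -- 409704
end package vga_timing_sketch;
```

Note that the pixel generator narrows these windows further (it only reads memory while line_count <= 514 and c_hs <= 655), so the enable widths above are upper bounds on the visible picture rather than its exact size.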


                                              PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en: restart the read address at the top of the frame.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: show the memory bit on all three colours.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the picture: green.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last location has been read.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: drive black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
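
The wrap-around literal "111011111111111111" in the pixel generator is easier to read as a decimal value. It equals 245,759, so the address counter steps through 245,760 one-bit locations. That figure equals 512 x 480, which appears consistent with the read window in the listing (about 512 clocks per line over 480 lines), although this reading of the timing is an interpretation and is not stated in the source. The sketch below shows one way the literal could be written as a named constant. The package and constant names are hypothetical; conv_std_logic_vector comes from the std_logic_arith package that the listing already uses.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;

-- Hypothetical package: names are illustrative only.
package pixelgen_consts_sketch is
  constant ADDR_WIDTH    : integer := 18;
  -- Last address read before wrapping; "111011111111111111" in binary.
  constant LAST_ADDR_INT : integer := 245759;
  constant LAST_ADDR     : std_logic_vector(ADDR_WIDTH - 1 downto 0) :=
    conv_std_logic_vector(LAST_ADDR_INT, ADDR_WIDTH);
end package pixelgen_consts_sketch;
```

With such a constant in scope, the comparison in the process could read `if (addr_s = LAST_ADDR) then`, leaving the behaviour unchanged.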


                                              12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the
option of being programmed in batches in a factory or in the field
(field-programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in the volume of production. In the early 1990s, FPGAs were primarily
used in telecommunications and networking. By the end of the decade, FPGAs
had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic. (See
"Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number
of higher-volume applications than they have historically been used for, and
the company attributes the growing number of FPGA design starts to them
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                              Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
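As a rough illustration of the kind of estimate mentioned above, Rent's rule
relates the number of external terminals T of a block of logic to the number
of gates g inside it as T = t * g^p, where t and p are empirical constants. A
larger exponent p indicates a design that needs more interconnect for the same
amount of logic. The sketch below evaluates the rule; the function name and
the default constants are illustrative assumptions, not vendor figures.

```python
def rent_terminals(gates: int, t: float = 4.0, p: float = 0.6) -> float:
    """Estimate the external terminals of a block of `gates` gates.

    Implements Rent's rule, T = t * g**p. The defaults for t (terminals
    per gate) and p (Rent exponent) are assumed values for illustration.
    """
    if gates < 1:
        raise ValueError("gates must be a positive integer")
    return t * gates ** p


if __name__ == "__main__":
    for g in (100, 1_000, 10_000):
        print(g, round(rent_terminals(g), 1))
```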

The FPGA considered here is an array (island-style) FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height
of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                              Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
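The logic block described above can be modelled behaviourally in a few lines.
This is a minimal sketch, not vendor code: the class name, the 16-bit
truth-table encoding, and the power-up state of the flip-flop are assumptions
made for illustration.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool):
        # Assumed encoding: bit i of truth_table is the LUT output
        # for the 4-bit input value i.
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth_table must fit in 16 bits")
        self.truth_table = truth_table
        self.registered = registered
        self.ff = 0  # flip-flop state, assumed to start at 0

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the combinational output for inputs a (LSB) to d (MSB)."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: capture the LUT output in the flip-flop."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000, registered=False)  # only input 1111 gives 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 0, 1, 1))
```

With `registered=True` the output changes only after `clock()` is called,
which mirrors the choice between the registered and unregistered outputs.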

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                              Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                              Switch box topology
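The planar (domain-based) connection pattern described above can be
enumerated with a short sketch. The side names, the function name, and the
zero-based track numbering are assumptions made for illustration; the text
numbers tracks from one.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width: int) -> dict:
    """Map each (side, track) wire end to the wire ends it can switch to.

    In the planar topology a wire on track k may connect only to track k
    on each of the three other sides of the switch box.
    """
    connections = {}
    for side in SIDES:
        for track in range(channel_width):
            connections[(side, track)] = [
                (other, track) for other in SIDES if other != side
            ]
    return connections


if __name__ == "__main__":
    box = planar_switch_box(channel_width=4)
    print(box[("north", 1)])
```

Each wire end has exactly three possible connections, matching the three
programmable switches per wire mentioned in the text.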

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

To go from schematic/HDL source files to an actual configuration, the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, produces a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device such
as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

                                              Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                              Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design
has become mainstream practice, and prototyping SoC and ASIC designs on an
FPGA has become a good method of doing this.

                                              Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure
that it is functionally correct. This compares favorably with designers
relying only on software simulations to verify that their hardware design is
sound, since simulation speed and modeling accuracy limitations hinder
development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and for testing at the
software-hardware integration stage. This allows many unforeseen software
bugs, which appear due to today's array of operating systems, applications,
and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies
in the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

                                              14 APPLICATIONS

                                              Motion Tracking Systems

                                              Image processing systems

                                              Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.



The register is updated on each rising edge of the CLK signal. Data can be
read from the memory space as long as the RD signal is active (High). The RD
signal is itself controlled by the external line display enable signal, enex.
This internally generated signal is defined as follows:

If ((HCOUNT > (PixTotal + 1)) and (HCOUNT < Line Period)) or
   ((VCOUNT > (LinTotal + 1)) and (VCOUNT < Frame Period)) then
    enex = 0
Else
    enex = 1

When enex is Low, read access from memory is disabled (RD = 0).
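The definition of enex above can be sketched in Python, reading the two
comparisons in each term as "greater than" and "less than". The function and
parameter names are hypothetical stand-ins for the signals named in the text,
and the 640 x 480 timing values in the usage lines (800 clocks per line, 525
lines per frame) are assumed for illustration only.

```python
def enex(hcount: int, vcount: int, pix_total: int, lin_total: int,
         line_period: int, frame_period: int) -> int:
    """Return the external display enable: 0 while blanking, else 1."""
    h_blank = (pix_total + 1) < hcount < line_period
    v_blank = (lin_total + 1) < vcount < frame_period
    return 0 if (h_blank or v_blank) else 1


if __name__ == "__main__":
    # Assumed timing values; RD follows enex, so RD = 0 whenever enex = 0.
    timing = dict(pix_total=640, lin_total=480,
                  line_period=800, frame_period=525)
    print(enex(100, 100, **timing))  # inside the visible area
    print(enex(700, 100, **timing))  # horizontal blanking
    print(enex(100, 500, **timing))  # vertical blanking
```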

The point at which data is loaded from memory into PIXREG depends on the
particular color palette that is chosen: Black & White, 16 Colors, or 64
Colors.

Black & White mode

The next byte of data will be loaded into the register whenever the lowest
three bits of the hvcnt signal, received from the Synchronization Unit, are
"000".

For the currently loaded byte, the active pixel is always in the lowest bit
position of the Pixel register. Each pixel in the data byte is moved into
this active pixel position by shifting the contents of the register right by
one bit on each rising edge of CLK.
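The Black & White load-and-shift behaviour described above can be sketched as
a small generator. This is an illustrative model only: it assumes hvcnt
counts up from zero by one per clock, that a byte is loaded instead of
shifted on the clocks where the low three bits are 000, and that the byte
address is hvcnt divided by eight. The function name is hypothetical.

```python
def bw_pixels(data: bytes):
    """Yield one 1-bit pixel per clock, lowest bit of each byte first."""
    pixreg = 0
    for hvcnt in range(8 * len(data)):
        if hvcnt & 0b111 == 0:
            pixreg = data[hvcnt >> 3]  # low three bits are 000: load a byte
        else:
            pixreg >>= 1               # otherwise shift the next pixel down
        yield pixreg & 1               # active pixel is the lowest bit


if __name__ == "__main__":
    print(list(bw_pixels(bytes([0b10110001]))))
```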

                                                16 Colors mode

The next byte of data will be loaded into the register whenever the lowest
bit of the hvcnt signal, received from the Synchronization Unit, is a 0.

For the currently loaded byte, the active pixel is always in the low-order
nibble of the Pixel register. Remember that in this mode, each byte of data
contains two pixels. The second pixel is moved into this active pixel
position by shifting the contents of the register right by four bits on the
rising edge of CLK.
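For 16 Colors mode, the same idea applies with two 4-bit pixels per byte. The
sketch below is a hedged model under the same assumptions as before (hvcnt
counts up from zero, one step per clock, and the byte address is hvcnt
divided by two); the function name is hypothetical.

```python
def sixteen_color_pixels(data: bytes):
    """Yield one 4-bit pixel per clock, low-order nibble of each byte first."""
    pixreg = 0
    for hvcnt in range(2 * len(data)):
        if hvcnt & 1 == 0:
            pixreg = data[hvcnt >> 1]  # lowest bit is 0: load the next byte
        else:
            pixreg >>= 4               # move the second pixel into place
        yield pixreg & 0xF             # active pixel is the low nibble


if __name__ == "__main__":
    print([hex(p) for p in sixteen_color_pixels(bytes([0xA5, 0x3C]))])
```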


                                                64 Colors mode

The next byte of data will be loaded into the register on the rising edge of
the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for driving
the red, green, and blue color guns of the target monitor. When the chosen
color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA Controller
receives an external reset signal (RST). This RGB code represents black.

The register is updated on each rising edge of the CLK signal, and the value
loaded is dependent on the state of the line display enable signal, en. When
en is Low, blanking is required and RGBREG is loaded with the code for black
(000000).

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode) and a
4-bit pixel (16 Colors mode) into the required RGB color code.


The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
and B1).

The monitor itself expects analog signals as inputs to its electron gun
control circuits. This is achieved by using 2-bit digital-to-analog converter
circuitry located on the Nano Board itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity
of each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
(maximum brightness).

With each analog input being one of four possible levels, the monitor can
display each pixel on the screen with one of 64 different color permutations.
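The arithmetic behind the four levels and the 64 colors can be checked with a
short sketch. The even spacing of the levels between 0 V and 0.7 V is an
assumption made here for illustration; the text gives only the end points.
The function name is hypothetical.

```python
V_MAX = 0.7  # volts, maximum brightness as stated in the text


def dac_level(code: int) -> float:
    """Convert a 2-bit color code (0..3) to an analog level in volts.

    Assumes the four levels are evenly spaced between 0 V and V_MAX.
    """
    if not 0 <= code <= 3:
        raise ValueError("code must be a 2-bit value")
    return V_MAX * (code / 3)


if __name__ == "__main__":
    print([round(dac_level(c), 3) for c in range(4)])
    # Three guns with four levels each give 4 * 4 * 4 combinations.
    colors = {(r, g, b) for r in range(4) for g in range(4) for b in range(4)}
    print(len(colors))
```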

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation
FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various
configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up
tables, data width converters, circular buffers, and shift registers, each
supporting various data widths and depths. This application note describes
the features and capabilities of block SelectRAM and illustrates how to
specify the various options using the Xilinx CORE Generator™ system or via
VHDL or Verilog instantiation.

                                                Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations, or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
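The synchronous behaviour described above (clock, clock enable, write enable, and reads that need a clock edge) can be sketched as a small single-port RAM. This is only an illustrative model, not the Xilinx primitive: the entity name spram_16kx1, its port names, and the 16K x 1 organization are assumptions chosen for the example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

-- Hypothetical 16K x 1 single-port synchronous RAM (all names are illustrative).
entity spram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic);
end spram_16kx1;

architecture rtl of spram_16kx1 is
  type ram_t is array (0 to 16383) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout changes only on an enabled clock edge.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end rtl;
```

Whether a synthesis tool maps such a description onto block RAM depends on the tool and its settings.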


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                                Interoperability

                                                Technology independence

                                                Design reuse

                                                Several levels of abstraction

                                                Readability

                                                Standard language

                                                Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                                Design specification language

                                                Design entry language

                                                Design simulation language

                                                Design documentation language

                                                An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

The built-in types of IEEE 1076 (e.g., bit) don't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE in response.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
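A minimal sketch of how these standard types appear in practice: std_logic from IEEE 1164 carries values such as 'Z' (high impedance), and unsigned from the numeric package gives vectors an arithmetic meaning. The entity std_types_demo and its ports are hypothetical names used only for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;  -- std_logic: 'U','X','0','1','Z','W','L','H','-'
use IEEE.numeric_std.all;     -- signed / unsigned

entity std_types_demo is
  port (
    a, b  : in  std_logic_vector(7 downto 0);
    oe    : in  std_logic;                      -- output enable
    sum   : out std_logic_vector(7 downto 0);
    bus_o : out std_logic);
end std_types_demo;

architecture rtl of std_types_demo is
begin
  -- Treat the vectors as unsigned numbers; the 8-bit result wraps modulo 256.
  sum <= std_logic_vector(unsigned(a) + unsigned(b));

  -- 'Z' is one of the nine std_logic values; the plain bit type cannot express it.
  bus_o <= a(0) when oe = '1' else 'Z';
end rtl;
```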

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes the entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have NO port declaration.

Port directions

In: The value of the port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
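Because an out port cannot be read back (in the language versions discussed here; VHDL-2008 relaxes this), a common alternative to buffer is to keep the value in an internal signal and copy it to the port. The sketch below assumes a hypothetical 4-bit counter named counter4; it is one possible pattern, not the only one.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.numeric_std.all;

entity counter4 is
  port (
    clk   : in  std_logic;
    rst   : in  std_logic;                       -- asynchronous, active high
    count : out std_logic_vector(3 downto 0));   -- out: assigned, never read
end counter4;

architecture rtl of counter4 is
  -- Internal copy that can be both read and assigned.
  signal count_i : unsigned(3 downto 0);
begin
  process (clk, rst)
  begin
    if rst = '1' then
      count_i <= (others => '0');
    elsif rising_edge(clk) then
      count_i <= count_i + 1;   -- reads count_i, not the out port
    end if;
  end process;

  count <= std_logic_vector(count_i);
end rtl;
```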

Architectures: Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link an entity declaration and an architecture body together.

The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.

Can be used to change simulation behavior without re-analyzing the VHDL source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate-level architecture and a behavioral architecture.

Configurations are always optional.
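As a sketch of how a configuration picks one of several architectures, the Comparator entity from above is repeated here with two architectures. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions for illustration.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture (as in the example above).
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture with the same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- The configuration explicitly selects Comparator1 for the entity.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end Comparator_cfg;
```

Without the configuration, the tool's default rule decides which architecture is used.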

Packages: Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
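A small package sketch showing the two parts: the declaration holds a constant, a subtype, and a function declaration, while the body holds the function body. The package name vga_types_pkg, the constant value, and the helper to_pixel are assumptions made up for this example.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_types_pkg is
  constant H_LAST_COUNT : integer := 793;            -- illustrative value
  subtype pixel_t is std_logic_vector(0 downto 0);   -- one bit per colour
  function to_pixel (b : std_logic) return pixel_t;
end vga_types_pkg;

-- Package body: needed here only because a function was declared.
package body vga_types_pkg is
  function to_pixel (b : std_logic) return pixel_t is
    variable p : pixel_t;
  begin
    p(0) := b;
    return p;
  end to_pixel;
end vga_types_pkg;
```

A design unit would make these names visible with a clause such as use work.vga_types_pkg.all, assuming the package was compiled into the work library.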

Libraries: Collection of VHDL design units (database).

1. Packages
   - package declaration
   - package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL Description

The circuit is described in terms of its components.

Ranges from a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram).

For large circuits, a low-level description quickly becomes impractical.
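A minimal structural sketch, assuming a hypothetical 2-bit equality checker (eq2) built from two instances of a hypothetical XNOR gate component (xnor2_gate). The architecture mainly lists components and the wires between them.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

-- Leaf component: a single XNOR gate.
entity xnor2_gate is
  port (a, b : in std_logic; y : out std_logic);
end xnor2_gate;

architecture dataflow of xnor2_gate is
begin
  y <= a xnor b;
end dataflow;

library IEEE;
use IEEE.std_logic_1164.all;

entity eq2 is
  port (a, b : in std_logic_vector(1 downto 0); eq : out std_logic);
end eq2;

architecture structural of eq2 is
  component xnor2_gate
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal m : std_logic_vector(1 downto 0);  -- per-bit match wires
begin
  u0 : xnor2_gate port map (a => a(0), b => b(0), y => m(0));
  u1 : xnor2_gate port map (a => a(1), b => b(1), y => m(1));
  -- Final AND kept as a simple signal assignment for brevity.
  eq <= m(0) and m(1);
end structural;
```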

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers => you must write the behavioral description of the registers.

The behavioral description can be provided in the form of subprograms (functions or procedures).
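Since the language has no built-in register, one usually describes it behaviorally with a clocked process. The sketch below is one common way to write an 8-bit register with enable; the entity name reg8 and its ports are hypothetical.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity reg8 is
  port (
    clk : in  std_logic;
    rst : in  std_logic;                      -- asynchronous, active high
    en  : in  std_logic;                      -- load enable
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0));
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      if en = '1' then
        q <= d;        -- q holds its value whenever en = '0'
      end if;
    end if;
  end process;
end behavioral;
```

A dataflow description can then instantiate such registers and describe the combinational logic between them with concurrent assignments.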

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

At the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL:

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
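The two ways of waking a process can be compared side by side. In this sketch (entity and labels are hypothetical), both processes compute the same AND function: the first uses a sensitivity list, the second an explicit wait statement at the end of the process.

```vhdl
library IEEE;
use IEEE.std_logic_1164.all;

entity and_two_ways is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end and_two_ways;

architecture behavioral of and_two_ways is
begin
  -- Form 1: sensitivity list; the process runs whenever a or b changes.
  p_list : process (a, b)
  begin
    y1 <= a and b;
  end process p_list;

  -- Form 2: no sensitivity list; a wait at the end suspends the process
  -- until a or b changes.
  p_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process p_wait;
end behavioral;
```

Mixing the two forms in one process is illegal, which is the "NOT both" rule stated above.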


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  --- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two: clk toggles on every rising edge of clk50m.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  --- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  --vsync counter
  --vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  --hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  ---- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
end process;

-- vsync pulse: active low during the first two lines of the frame
process(clk, rstn)
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
end process;

-- One-clock delayed copy of the vertical enable
process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;


                                                PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

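architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
      -- v_en high while its delayed copy is still low: start of a new frame
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- read the next pixel from memory and drive it on all three colours
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            -- horizontal count above 655: no read, drive green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- line count above 514: no read, drive white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address once the last location has been reached
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- outside the enabled region: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;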


                                                12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates
1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size
1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to reprogram in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, to which the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs are capable of partial reconfiguration, which lets one portion of the device be reprogrammed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
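The precision-for-parallelism trade-off described above can be sketched as a bank of independent multipliers whose word width and lane count are generics. This is only an illustration under stated assumptions: the entity name parallel_mults, the generics LANES and WIDTH, and the packed port layout are all hypothetical, and it uses ieee.numeric_std rather than the std_logic_arith packages used in the listings earlier in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: LANES unsigned multipliers working side by side.
-- Operands for all lanes are packed into the a and b buses.
entity parallel_mults is
  generic (
    LANES : positive := 4;  -- number of parallel arithmetic units
    WIDTH : positive := 8   -- operand precision in bits
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    b   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    p   : out std_logic_vector(2*LANES*WIDTH - 1 downto 0)
  );
end entity parallel_mults;

architecture sketch of parallel_mults is
begin
  -- One registered multiplier per lane; every lane computes on the
  -- same clock edge, so LANES products are produced per cycle.
  lanes_g : for i in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        p(2*WIDTH*(i + 1) - 1 downto 2*WIDTH*i) <=
          std_logic_vector(
            unsigned(a(WIDTH*(i + 1) - 1 downto WIDTH*i)) *
            unsigned(b(WIDTH*(i + 1) - 1 downto WIDTH*i)));
      end if;
    end process;
  end generate lanes_g;
end architecture sketch;
```

With the same total input bus width, halving WIDTH from 16 to 8 allows LANES to double from 4 to 8, which is the kind of exchange the paragraph describes. The actual resource cost depends on the target device and the synthesis tool.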

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turnaround times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
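A behavioural sketch of the logic block just described may help make the structure concrete. It is an assumption-laden model, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and REGISTERED, and the port names are all hypothetical, and the configuration bits that a real device stores in memory cells are modelled here as generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT followed by a
-- flip-flop, with a single output that is either registered or not.
entity logic_block is
  generic (
    -- Truth table of the LUT; bit i is the output for input value i.
    -- x"8000" sets only bit 15, which makes the LUT a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the flip-flop output, false the raw LUT output.
    REGISTERED : boolean := false
  );
  port (
    clk     : in  std_logic;
    lut_in  : in  std_logic_vector(3 downto 0);
    blk_out : out std_logic
  );
end entity logic_block;

architecture sketch of logic_block is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT behaves like a 16-entry memory addressed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop samples the LUT output on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- Output selection between the registered and unregistered value.
  blk_out <= ff_q when REGISTERED else lut_out;
end architecture sketch;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.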

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                Switch box topology
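The connection rule of the planar topology can also be written down as a small VHDL package. This is a sketch of the rule as stated in the text, not a model of the switches themselves; the package name switch_box_pkg, the type side_t, and both function names are hypothetical. The switch count follows from the description: each track has four wire ends, each end has three switches, and every switch joins two ends.

```vhdl
-- Hypothetical helper package describing the planar switch box rule.
package switch_box_pkg is
  -- The four channel segments that meet at a switch box.
  type side_t is (north, east, south, west);

  -- True when the planar topology provides a programmable switch
  -- between the two wire ends.
  function planar_connects (side_a  : side_t; track_a : natural;
                            side_b  : side_t; track_b : natural)
    return boolean;

  -- Number of programmable switches in one box for channel width w.
  function planar_switch_count (w : natural) return natural;
end package switch_box_pkg;

package body switch_box_pkg is
  function planar_connects (side_a  : side_t; track_a : natural;
                            side_b  : side_t; track_b : natural)
    return boolean is
  begin
    -- Same track number, but on a different side of the box.
    return (track_a = track_b) and (side_a /= side_b);
  end function planar_connects;

  function planar_switch_count (w : natural) return natural is
  begin
    -- 4 wire ends * 3 switches each / 2 ends per switch = 6 per track.
    return 6 * w;
  end function planar_switch_count;
end package body switch_box_pkg;
```

For example, planar_connects(north, 1, east, 1) is true, planar_connects(north, 1, east, 2) is false, and a channel width of 4 gives planar_switch_count(4) = 24.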

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license), and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
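
The RTL simulation step described above can be sketched with a small self-checking test bench. This is only an illustrative sketch: the design under test (eq8) and the test bench (eq8_tb) are hypothetical names invented for this example and are not part of the VGA Controller, and the 10 ns wait is an arbitrary choice.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: 8-bit equality comparator.
entity eq8 is
  port (a, b : in  std_logic_vector(7 downto 0);
        eq   : out std_logic);
end entity eq8;

architecture rtl of eq8 is
begin
  eq <= '1' when a = b else '0';
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: it has no ports; it drives the DUT and checks the responses.
entity eq8_tb is
end entity eq8_tb;

architecture sim of eq8_tb is
  signal a, b : std_logic_vector(7 downto 0) := (others => '0');
  signal eq   : std_logic;
begin
  -- Instantiate the design under test.
  dut : entity work.eq8
    port map (a => a, b => b, eq => eq);

  stimulus : process
  begin
    -- Case 1: equal operands.
    a <= x"3C"; b <= x"3C";
    wait for 10 ns;
    assert eq = '1' report "equal inputs should give eq = 1" severity error;

    -- Case 2: operands that differ in one bit.
    a <= x"3C"; b <= x"3D";
    wait for 10 ns;
    assert eq = '0' report "different inputs should give eq = 0" severity error;

    report "test bench finished" severity note;
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

In principle the same test bench can be rerun against the gate level description and against the back-annotated netlist, so that all three simulation stages check identical behavior.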

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design
has become mainstream. Prototyping SoC and ASIC designs on an FPGA has
become a good method to do this.


Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. The alternative is for designers to rely only on software
simulations to verify that their hardware design is sound, where simulation speed and
modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This exposes many unforeseen software bugs that appear due
to today's array of operating systems, applications and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement of the chip features than would
otherwise have been possible.


                                                13 SIMULATION RESULTS

GATE LEVEL

Fig: VGA CONTROLLER

GATE LEVEL

Fig: SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                14 APPLICATIONS

                                                Motion Tracking Systems

                                                Image processing systems

                                                Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.


                                                REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


                                                  64 Colors mode

The next byte of data will be loaded into the register on the rising edge
of the external system clock signal (CLK). In this mode, the read of pixel data
does not depend on the status of the hvcnt signal received from the
Synchronization Unit.

9.3 The RGB register (RGBREG)

The RGB register is used to store the six bits that are required for
driving the red, green and blue color guns of the target monitor. When the
chosen color palette is either Black & White or 16 Colors, these six bits are
obtained by mapping the value of the active pixel to a predefined RGB code.
When the chosen color palette is 64 Colors, the actual pixel value is used
directly.

The register RGBREG is reset to zero (000000) when the VGA
Controller receives an external reset signal (RST). This RGB code represents
black.

The register is updated on each rising edge of the CLK signal, and the
value loaded is dependent on the state of the line display enable signal, en.
When en is Low, blanking is required and RGBREG is loaded with the code
for black (000000).
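
The register behavior described above can be sketched in VHDL for the 64 Colors case, where the pixel value is used directly. This is a minimal sketch under stated assumptions, not the Controller's actual source: the entity name rgbreg_sketch and the pixel and rgb port names are hypothetical, and the reset is assumed here to be active-high and synchronous, which the text does not specify.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgbreg_sketch is
  port (clk   : in  std_logic;                      -- CLK in the text
        rst   : in  std_logic;                      -- RST (assumed active-high, synchronous)
        en    : in  std_logic;                      -- line display enable
        pixel : in  std_logic_vector(5 downto 0);   -- hypothetical name: 6-bit pixel, 64 Colors mode
        rgb   : out std_logic_vector(5 downto 0));  -- hypothetical name: registered RGB code
end entity rgbreg_sketch;

architecture rtl of rgbreg_sketch is
  signal rgbreg : std_logic_vector(5 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        rgbreg <= (others => '0');  -- reset: 000000 is the code for black
      elsif en = '0' then
        rgbreg <= (others => '0');  -- blanking: load black
      else
        rgbreg <= pixel;            -- 64 Colors: pixel value used directly
      end if;
    end if;
  end process;

  rgb <= rgbreg;
end architecture rtl;
```

For the Black & White and 16 Colors palettes, the final branch would instead load the predefined RGB code that the pixel value maps to.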

Table 2 illustrates the mapping of a 1-bit pixel (Black & White mode)
and a 4-bit pixel (16 Colors mode) into the required RGB color code.

The RGB color code stored in the RGB register is output from the VGA
Controller as separate 2-bit R, G and B values (outputs R0, R1, G0, G1, B0
and B1).

The monitor itself expects analog signals as inputs to its electron gun control
circuits. This is achieved by using 2-bit digital to analog converter circuitry,
located on the NanoBoard itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations.
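
The arithmetic behind these figures can be written out as follows. The even spacing of the four levels is an assumption made for illustration; the text gives only the end points of 0 V and 0.7 V, and k denotes the 2-bit code for one color.

```latex
\[
V_k = \frac{k}{3} \times 0.7\ \text{V}, \quad k \in \{0, 1, 2, 3\}
\;\Rightarrow\;
V_0 = 0\ \text{V},\quad V_1 \approx 0.233\ \text{V},\quad V_2 \approx 0.467\ \text{V},\quad V_3 = 0.7\ \text{V}
\]
\[
N_{\text{colors}} = 4 \times 4 \times 4 = 4^{3} = 2^{6} = 64
\]
```

The count of 64 matches the six bits held in the RGB register, two for each of the three colors.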

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large, on-chip memories, Spartan™-3
Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers, and shift
registers, each supporting various data widths and depths. This application note
describes the features and capabilities of block SelectRAM and illustrates how
to specify the various options using the Xilinx CORE Generator™ system or
via VHDL or Verilog instantiation.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which is allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical, and both ports are
interchangeable and both ports support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Cascade multiple block RAMs
to create deeper and wider memory organizations, with a minimal timing
penalty incurred through specialized routing resources.
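
The synchronous, single-port use of one block can be sketched as a behavioral VHDL memory of 2048 locations by 9 bits (2048 x 8 = 16K data bits plus 2048 x 1 = 2K parity bits, 18,432 bits in total). This is an illustrative sketch only: the entity and port names are hypothetical, it is not the vendor's block RAM primitive, and whether a given synthesis tool maps it onto a block RAM is an assumption that depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity ram_2k_x9_sketch is
  port (clk  : in  std_logic;
        en   : in  std_logic;                       -- clock enable
        we   : in  std_logic;                       -- write enable
        addr : in  std_logic_vector(10 downto 0);   -- 2**11 = 2048 locations
        din  : in  std_logic_vector(8 downto 0);    -- 8 data bits + 1 parity bit
        dout : out std_logic_vector(8 downto 0));
end entity ram_2k_x9_sketch;

architecture rtl of ram_2k_x9_sketch is
  type ram_t is array (0 to 2047) of std_logic_vector(8 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then                               -- nothing happens without clock enable
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;    -- synchronous write
        end if;
        -- Synchronous read; on a write cycle this returns the old contents.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write need a clock edge with en High, which mirrors the statement above that read operations are also synchronous.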


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                                  Interoperability

                                                  Technology independence

                                                  Design reuse

                                                  Several levels of abstraction

                                                  Readability

                                                  Standard language

                                                  Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                                                  Design specification language

                                                  Design entry language

                                                  Design simulation language

                                                  Design documentation language

                                                  An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE
standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using
their own, non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
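
A short simulation-only sketch can make these types concrete. The entity and signal names are hypothetical, and ieee.numeric_std is used here as the package that provides the signed and unsigned types.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic and its nine values
use ieee.numeric_std.all;     -- IEEE 1076.3: signed and unsigned

entity std_types_sketch is
end entity std_types_sketch;

architecture sim of std_types_sketch is
  -- The nine std_logic values: 'U' 'X' '0' '1' 'Z' 'W' 'L' 'H' '-'
  signal contended   : std_logic;
  signal released    : std_logic;
  signal as_unsigned : unsigned(3 downto 0) := "1111";
  signal as_signed   : signed(3 downto 0)   := "1111";
begin
  -- Two drivers fighting over one signal resolve to unknown.
  contended <= '0';
  contended <= '1';

  -- A high-impedance driver gives way to the other driver.
  released <= 'Z';
  released <= '1';

  check : process
  begin
    wait for 1 ns;
    assert contended = 'X' report "conflict should resolve to X" severity error;
    assert released = '1' report "Z should give way to 1" severity error;
    -- The same bit pattern reads as 15 when unsigned and as -1 when signed.
    assert to_integer(as_unsigned) = 15 report "unsigned value" severity error;
    assert to_integer(as_signed) = -1 report "signed value" severity error;
    wait;
  end process check;
end architecture sim;
```

The unknown and high-impedance cases are exactly the simulation conditions that the base language does not cover on its own.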

                                                  VHDL Environment

                                                  Design Units


                                                  Segments of VHDL code that can be compiled separately and stored in a

                                                  library

                                                  Entities

                                                  A black box with interface definition

Defines the inputs/outputs of a component (define pins)

                                                  A way to represent modularity in VHDL

                                                  Similar to symbol in schematic

                                                  Entity declaration describes entity

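E.g.

library ieee;
use ieee.std_logic_1164.all;

-- 8-bit comparator: A and B are inputs, EQ is the output
entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;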

Ports: Provide channels of communication between the component and its
environment

                                                  Each port must have a name direction and a type

                                                  An entity may have NO port declaration

Port directions:
In: the value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
Out: assignments can be made to the port, but data cannot be read from it (VHDL-2008 lifts this restriction). Multiple assignments are allowed.
Inout: bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: an out port with read capability. May have at most one assignment (buffer ports are not recommended).
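The usual way to avoid a buffer port is to keep an internal signal that may be read freely and to drive the out port from it. The sketch below shows that pattern; the entity toggle and the signal q_int are hypothetical names chosen for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical toggle flip-flop, used only to illustrate port modes
entity toggle is
  port (clk : in  std_logic;   -- mode in: read only
        q   : out std_logic);  -- mode out: assigned only
end toggle;

architecture rtl of toggle is
  signal q_int : std_logic := '0';  -- internal copy that may be read
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- "q <= not q" would read the out port, which is not allowed
      -- before VHDL-2008, so the internal signal is toggled instead
      q_int <= not q_int;
    end if;
  end process;

  q <= q_int;  -- the out port is driven from the internal signal
end rtl;
```

The VGA controller listings later in this document follow the same idea with their `_s` signals (for example hsync_s driving hsync).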

Architectures: every entity has at least one architecture
One entity can have several architectures
Architectures can describe the design using
- Behavior
- Structure
- Dataflow
Architectures can describe the design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: link the entity declaration and the architecture body together
The concept of a default configuration is a bit messy in VHDL '87
- the last architecture analyzed is linked to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., both a gate-level architecture and a behavioral architecture
Configurations are always optional
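To make the role of a configuration concrete, here is a minimal sketch with one entity, two architectures and a configuration declaration that selects one of them. The names Comparator2 and Comparator_cfg are invented for this example; the entity is repeated so that the listing stands on its own.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: a single concurrent (dataflow) assignment
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Second architecture: the same function written as a process
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Hypothetical configuration that binds the entity to Comparator2
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg instead of the bare entity selects Comparator2 explicitly, rather than relying on whichever architecture was analyzed last.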

Packages: packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units
A package consists of a declaration part and an optional body part

The package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

The package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
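The split between declaration and body can be sketched as follows. The package name vga_types, its constants and the parity function are all hypothetical and serve only to show where each kind of item goes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package: the declaration part is what other units see
package vga_types is
  constant ADDR_WIDTH : natural := 18;
  constant LAST_ADDR  : natural;  -- deferred constant, value set in the body
  subtype addr_t is std_logic_vector(ADDR_WIDTH - 1 downto 0);
  function parity (v : std_logic_vector) return std_logic;
end vga_types;

-- The body holds the deferred constant value and the function body
package body vga_types is
  constant LAST_ADDR : natural := 2**ADDR_WIDTH - 1;

  function parity (v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    -- XOR all bits of the vector together
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end vga_types;
```

A design unit would make these items visible with `use work.vga_types.all;`, assuming the package was compiled into the default work library.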

Libraries: a collection of VHDL design units (database)
1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in the UNIX file system
Can also be any other kind of database

                                                  Levels of Abstraction

                                                  VHDL supports many possible styles of design description which differ

                                                  primarily in how closely they relate to the HW

                                                  It is possible to describe a circuit in a number of ways

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

Circuit is described in terms of its components
Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)
For large circuits a low-level description quickly becomes impractical
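A small structural description might look like the sketch below, which builds a half adder purely by wiring two gate components together. The entities and2, xor2 and half_adder are hypothetical examples, not part of the VGA design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
  y <= a and b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end xor2;

architecture rtl of xor2 is
begin
  y <= a xor b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end half_adder;

-- Structural architecture: only component instances and wiring
architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : and2 port map (a => a, b => b, y => carry);
end structural;
```

The top level of the VGA controller in section 11 is written in the same style: it instantiates the sync_unit, pixelgen and memory components and connects them with signals.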

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no third-party models for registers, you must write the behavioral description of the registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
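Since the language has no built-in register, a behavioral register description has to be written at some point. The sketch below uses a clocked process rather than a subprogram, which is one common way to do it; the entity reg8 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous active-low reset
entity reg8 is
  port (clk, rstn : in  std_logic;
        d         : in  std_logic_vector(7 downto 0);
        q         : out std_logic_vector(7 downto 0));
end reg8;

architecture behv of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');   -- reset clears the register
    elsif rising_edge(clk) then
      q <= d;                 -- capture the input on each rising clock edge
    end if;
  end process;
end behv;
```

Synthesis tools generally recognize this reset-then-clock-edge template and map it to flip-flops, so it can serve as the register model that the dataflow description relies on.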

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)
If no actual delay is used, the order of sequential operations is defined
At the lower levels of abstraction (e.g. RTL) synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis
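The delay notation can be tried out in a simulator with a port-less, simulation-only entity such as the sketch below. The entity delay_demo and the signal names are hypothetical, and the 25 ns and 10 ns values are arbitrary.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only example (no ports, not meant for synthesis)
entity delay_demo is
end delay_demo;

architecture behv of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- b inverts itself every 25 ns, giving a free-running square wave
  b <= not b after 25 ns;

  -- a follows b with a 10 ns delay; a synthesis tool would ignore
  -- the "after" clause and treat this as a plain connection
  a <= b after 10 ns;
end behv;
```

In simulation, a shows the same waveform as b shifted by 10 ns, which is exactly the timing information that is lost once the design is synthesized.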

                                                  Concurrent Vs Sequential

                                                  Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process
The process keyword in VHDL:
- a process statement is a concurrent statement
- statements inside a process statement are sequential statements
- a process must contain either a sensitivity list or wait statement(s), but NOT both
- the sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
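The two ways of waking a process can be compared side by side. In this sketch both processes compute the same AND function, one with a sensitivity list and one with a wait statement; the entity and_gate and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end and_gate;

architecture behv of and_gate is
begin
  -- Form 1: sensitivity list, the process runs whenever a or b changes
  p_list : process (a, b)
  begin
    y1 <= a and b;
  end process p_list;

  -- Form 2: no sensitivity list, an explicit wait at the end instead
  p_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process p_wait;
end behv;
```

A sensitivity list behaves like a `wait on` statement placed at the end of the process, which is why the two forms may not be combined in one process.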


                                                  11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- Only numeric_std is used here: combining it with STD_LOGIC_ARITH and
-- STD_LOGIC_UNSIGNED makes the signed/unsigned types ambiguous.
use IEEE.NUMERIC_STD.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (clk50m : in  std_logic;   -- 50 MHz board clock
        rstn   : in  std_logic;   -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode out, matching the sync_unit entity
      c_hs       : out integer);   -- mode out, matching the sync_unit entity
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: rstn is active low, so this connection assumes the memory
  -- core's sinit input is configured as active low as well.
  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

                                                  SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: c_hs_s counts pixel clocks 0..793 within each line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vertical line counter: advances on each rising edge of h_en_s
  -- and wraps to 0 after line 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: active low while c_hs_s is in 0..95
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- horizontal enable (row information is sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- vertical sync pulse: active low during the first two lines
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, used downstream to detect its rising edge
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;

                                                  PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

                                                  process(clkrstn)

                                                  begin

                                                  50

                                                  if(rstn = 0)then

                                                  rd lt= 0

                                                  red lt= 0

                                                  green lt= 0

                                                  blue lt= 0

                                                  addr_s lt= x0000 amp 00

                                                  elsif (clk = 1 and clkevent)then

                                                  if((v_en=1) and (v_en_reg=0))then

                                                  addr_slt=(others =gt0)

                                                  end if

                                                  if(v_en=1 and h_en=1)then

                                                  if(line_countlt=514)then

                                                  if(c_hs lt= 655)then

                                                  rd lt= 1

                                                  red lt= data_in

                                                  green lt= data_in

                                                  blue lt= data_in

                                                  addr_s lt=addr_s +000000000000000001

                                                  else

                                                  rd lt= 0

                                                  red lt= 0

                                                  green lt= 1

                                                  blue lt= 0

                                                  end if

                                                  else

                                                  rd lt= 0

                                                  red lt= 1

                                                  green lt= 1

                                                  blue lt= 1

                                                  51

                                                  end if

                                                  if(addr_s=111011111111111111)then

                                                  addr_slt= (others =gt0)

                                                  end if

                                                  else

                                                  red lt= 0

                                                  green lt= 0

                                                  blue lt= 0

                                                  end if

                                                  end if

                                                  -- end if

                                                  --end if

                                                  end process

                                                  end behv


                                                  12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name field-programmable. FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called logic blocks and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates such as AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 had a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was
inducted into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates and logic blocks originate in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant part of its market
share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete system on a programmable chip. This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device; it uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic. (See Soft
processors below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or
reconfigurable systems: CPUs that reconfigure themselves to suit the task
at hand. The Mitrion Virtual Processor from Mitrionics is an example of a
reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at run time, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 (estimate): $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower and less energy efficient, and have
generally achieved less functionality, than their fixed ASIC counterparts.
A combination of volume, fabrication improvements, research and
development, and the I/O capabilities of new supercomputers has largely
closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in
the field to fix bugs, and lower non-recurring engineering costs. Vendors
can also take a middle road by developing their hardware on ordinary FPGAs
but manufacturing the final version so that it can no longer be modified
after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively.
ASIC complexity has increased development time and costs.
R&D resources and headcount are decreasing.
Revenue losses for slow time-to-market are increasing.
Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company credits them for the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs support partial re-configuration, which lets one portion of the
device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs began as competitors to CPLDs and competed in a similar space, that
of glue logic for PCBs. As their size, capabilities and speed increased,
they began to take over larger and larger functions, to the point where
some are now marketed as full systems on chips (SoCs). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the
late 1990s, applications that had traditionally been the sole reserve of
DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make
use of the massive parallelism offered by their architecture. One such area
is code breaking, in particular brute-force attacks on cryptographic
algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low (MHz-range) clock rates.
The flexibility of the FPGA allows for even higher performance by trading
off precision and range in the number format for an increased number of
parallel arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
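
The precision-for-parallelism trade described above can be made concrete
with a small worked example. The sketch below is only illustrative: the
logic budget, the 100 MHz clock and the assumption that a multiplier needs
roughly bits x bits logic cells are all hypothetical and are not taken from
any particular device.

```python
def multiplier_cost(bits: int) -> int:
    """Assumed area model: a multiplier needs roughly bits*bits logic cells."""
    return bits * bits


def parallel_units(logic_budget: int, bits: int) -> int:
    """Number of multipliers of the given width that fit in the budget."""
    return logic_budget // multiplier_cost(bits)


def throughput(logic_budget: int, bits: int, clock_hz: int) -> int:
    """Multiplications per second if every unit finishes one per clock."""
    return parallel_units(logic_budget, bits) * clock_hz


BUDGET = 100_000         # hypothetical number of logic cells
CLOCK_HZ = 100_000_000   # hypothetical 100 MHz clock

wide = throughput(BUDGET, 32, CLOCK_HZ)    # 32-bit operands
narrow = throughput(BUDGET, 16, CLOCK_HZ)  # 16-bit operands

print(f"32-bit: {parallel_units(BUDGET, 32)} units, {wide:.2e} mult/s")
print(f"16-bit: {parallel_units(BUDGET, 16)} units, {narrow:.2e} mult/s")
```

Under these assumptions, halving the operand width roughly quadruples the
number of units that fit, and with it the throughput at the same clock rate.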

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in per-unit hardware cost for a programmable
chip is more affordable than the development resources that would be spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.
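
The low-volume argument above is a break-even calculation: the ASIC has a
high one-time (non-recurring engineering, NRE) cost and a low unit cost,
while the FPGA has the opposite profile. A minimal sketch follows; every
cost figure in it is a made-up placeholder chosen only to show the
arithmetic.

```python
def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """One-time engineering cost plus per-unit cost for a production run."""
    return nre + unit_cost * units


def break_even_units(asic_nre: float, asic_unit: float,
                     fpga_nre: float, fpga_unit: float) -> float:
    """Volume at which the ASIC and FPGA total costs are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("sketch assumes the FPGA costs more per unit")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical figures, for illustration only.
ASIC_NRE, ASIC_UNIT = 1_000_000.0, 5.0
FPGA_NRE, FPGA_UNIT = 100_000.0, 50.0

volume = break_even_units(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"break-even volume: {volume:.0f} units")
```

Below the break-even volume the FPGA is the cheaper option overall; above
it the ASIC's lower unit cost wins.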

                                                  Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads
may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
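
Rent's rule, mentioned above, is commonly written as T = t * g^p, where T
is the number of external connections of a region of logic, g is the number
of logic blocks in that region, t is the average number of terminals per
block and p is the Rent exponent (between 0 and 1). The sketch below only
evaluates this formula; the parameter values are hypothetical, and the text
does not say how any particular manufacturer turns such estimates into a
channel width.

```python
def rent_terminals(blocks: int, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate external connections of a region using T = t * g**p."""
    if blocks < 1:
        raise ValueError("a region must contain at least one block")
    if not 0.0 < rent_exponent < 1.0:
        raise ValueError("the Rent exponent is expected between 0 and 1")
    return terminals_per_block * blocks ** rent_exponent


# Hypothetical parameters: 5 terminals per logic block, two exponents.
for p in (0.5, 0.7):
    estimate = rent_terminals(1024, 5.0, p)
    print(f"p = {p}: about {estimate:.0f} connections for 1024 blocks")
```

A larger exponent means the logic is more heavily interconnected, which is
the crossbar-versus-systolic-array contrast drawn in the paragraph above.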

The example FPGA is an array, or island-style, FPGA. It consists of an array
of logic blocks and routing channels. Two I/O pads fit into the height of
one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

Figure 12.1: FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started
moving to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                  Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in
commercial FPGAs, they are managed separately from other signals.
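
The behaviour of such a logic block can be illustrated in software. The
sketch below is a toy model, not a description of any vendor's cell: the
class name, the bit ordering of the truth table and the flip-flop's initial
value of 0 are all assumptions made for the example.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        # 16 configuration bits; bit i is the output for input pattern i.
        self.truth_table = truth_table & 0xFFFF
        # True selects the flip-flop output, False the raw LUT output.
        self.registered = registered
        self.ff = 0  # assumed initial flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        # The four inputs form an index into the 16-entry truth table.
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock_edge(self, a: int, b: int, c: int, d: int) -> None:
        # On a clock edge the flip-flop captures the current LUT output.
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        return self.ff if self.registered else self.lut(a, b, c, d)


# A 4-input AND: only input pattern 15 (all ones) produces 1.
and4 = LogicBlock(0x8000)
# A 4-input XOR (parity) with a registered output.
xor4 = LogicBlock(0x6996, registered=True)
```

The same block implements AND, XOR or any other 4-input function purely by
changing the 16 configuration bits, which is what makes the LUT the basic
programmable element.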

For this example architecture, the locations of the FPGA logic block pins
are shown below.

                                                  Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right of and
the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can
be constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in
this architecture is the planar or domain-based switch box topology. In
this topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

                                                  Switch box topology
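
The planar rule described above is simple enough to state as code. The
sketch below is an assumed model: the side names and the 1-based track
numbering are chosen to match the prose, not taken from any tool.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side: str, track: int, width: int):
    """Wires reachable from (side, track) in a planar switch box.

    A wire on a given track can reach only the same track number on each
    of the three other sides, which gives three programmable switches.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


def switch_count(width: int) -> int:
    """Total switches in the box: one per pair of sides per track."""
    pairs_of_sides = len(SIDES) * (len(SIDES) - 1) // 2
    return pairs_of_sides * width


print(planar_connections("west", 2, width=4))
print(switch_count(4))
```

Because a signal can never change track number inside a planar switch box,
a route that starts on one track stays on it until it reaches a logic block
or I/O pad.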

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives
those functions increased speed compared to building them from primitives.
Examples include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) description or a schematic design. The HDL form can be easier
to work with when handling large structures, because it is possible to
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped
netlist is generated. The netlist can then be fitted to the actual FPGA
architecture using a process called place-and-route, usually performed by
the FPGA company's proprietary place-and-route software. The user validates
the map, place and route results via timing analysis, simulation and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the
source files are fed to a software suite from the FPGA/CPLD vendor which,
through a series of steps, produces a configuration file. This file is then
transferred to the FPGA/CPLD via a serial interface (JTAG) or to an
external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. Because designing in HDLs has
been compared to programming in assembly language, there are moves to
reduce the complexity by raising the abstraction level through the
introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and
optimized to speed up the design process. These predefined circuits are
commonly called IP cores and are available from FPGA vendors and third-party
IP suppliers (rarely free, and typically released under proprietary
licenses). Other predefined circuits are available from developer
communities such as OpenCores (typically free, and released under the GPL,
BSD or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine
has mapped the design to a netlist, the netlist is translated to a
gate-level description, where simulation is repeated to confirm that the
synthesis proceeded without errors. Finally, the design is laid out in the
FPGA, at which point propagation delays can be added and the simulation run
again with these values back-annotated onto the netlist.

                                                  Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. (CMOS)

Antifuse - One-time programmable. (CMOS)

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. (CMOS)

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. (CMOS)

Flash - Flash-erase EPROM technology. Can be erased even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. (CMOS)

Fuse - One-time programmable. (Bipolar)

                                                  Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs together with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

                                                  Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, outside the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA CONTROLLER (gate level)

Fig: SYNC UNIT (gate level)

Fig: TECHNOLOGY SCHEMATIC

                                                  14 APPLICATIONS

                                                  Motion Tracking Systems

                                                  Image processing systems

                                                  Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.



The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the NanoBoard itself, as shown in Figure 9.1.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations (4 x 4 x 4 = 64).
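The split of the RGB register into the six digital outputs can be sketched as follows. This is only an illustration: the entity name rgb_split and the packing order of the 6-bit register (R1 R0 G1 G0 B1 B0, most significant bit first) are assumptions, not taken from the Controller's actual source.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity rgb_split is
  port (
    -- Assumed packing, MSB first: R1 R0 G1 G0 B1 B0
    rgb    : in  std_logic_vector(5 downto 0);
    r1, r0 : out std_logic;  -- 2-bit red intensity
    g1, g0 : out std_logic;  -- 2-bit green intensity
    b1, b0 : out std_logic   -- 2-bit blue intensity
  );
end entity rgb_split;

architecture dataflow of rgb_split is
begin
  -- Each 2-bit pair feeds one 2-bit DAC, giving 4 levels per color
  -- and 4 * 4 * 4 = 64 colors in total.
  r1 <= rgb(5);
  r0 <= rgb(4);
  g1 <= rgb(3);
  g0 <= rgb(2);
  b1 <= rgb(1);
  b0 <= rgb(0);
end architecture dataflow;
```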

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

                                                    Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
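The single-port, fully synchronous behavior described above can be sketched in VHDL as shown below. This is a behavioral model written for illustration, not the Xilinx primitive or CORE Generator output; the entity name sp_ram_2kx9 and the port names are assumptions. The 2K x 9 organization is chosen because 2048 x 9 = 18,432 bits, that is, 8 data bits plus 1 parity bit per location. Synthesis tools commonly map a description of this shape onto block RAM, but that depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram_2kx9 is
  port (
    clk  : in  std_logic;                      -- port clock
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(10 downto 0);  -- 2048 locations
    din  : in  std_logic_vector(8 downto 0);   -- 8 data bits + 1 parity bit
    dout : out std_logic_vector(8 downto 0)
  );
end entity sp_ram_2kx9;

architecture rtl of sp_ram_2kx9 is
  type ram_t is array (0 to 2047) of std_logic_vector(8 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: the output changes only on a clock edge.
        -- During a write, the old contents are read (read-first).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```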

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL

                                                    Interoperability

                                                    Technology independence

                                                    Design reuse

                                                    Several levels of abstraction

                                                    Readability

                                                    Standard language

                                                    Widely supported

                                                    What is VHDL

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed IC)

                                                    Design specification language

                                                    Design entry language

                                                    Design simulation language

                                                    Design documentation language

                                                    An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming nonstandard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
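The following sketch shows how these standards appear in ordinary code: the package ieee.std_logic_1164 supplies std_logic (including the high-impedance value 'Z'), and ieee.numeric_std (IEEE 1076.3) supplies signed and unsigned. The entity types_demo and its ports are hypothetical names chosen for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic: 'U','X','0','1','Z','W','L','H','-'
use ieee.numeric_std.all;     -- signed and unsigned

entity types_demo is
  port (
    a, b   : in  std_logic_vector(7 downto 0);
    oe     : in  std_logic;                     -- output enable
    sum    : out std_logic_vector(7 downto 0);
    less_u : out std_logic;                     -- a < b as unsigned numbers
    less_s : out std_logic                      -- a < b as signed numbers
  );
end entity types_demo;

architecture rtl of types_demo is
  signal total : unsigned(7 downto 0);
begin
  -- Arithmetic is defined on unsigned/signed, not on std_logic_vector.
  total <= unsigned(a) + unsigned(b);

  -- 'Z' models a high-impedance (undriven) output.
  sum <= std_logic_vector(total) when oe = '1' else (others => 'Z');

  -- The same bit pattern compares differently depending on the numeric type.
  less_u <= '1' when unsigned(a) < unsigned(b) else '0';
  less_s <= '1' when signed(a) < signed(b) else '0';
end architecture rtl;
```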

                                                    VHDL Environment

                                                    Design Units


                                                    Segments of VHDL code that can be compiled separately and stored in a

                                                    library

Entities

A black box with an interface definition

Defines the inputs/outputs of a component (defines pins)

A way to represent modularity in VHDL

Similar to a symbol in a schematic

The entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction, and a type

An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
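One common consequence of these rules is shown in the sketch below: because an out port cannot be read under the rules above, a counter keeps its state in an internal signal and copies it to the port, which avoids the buffer mode. The entity counter4 and its port names are hypothetical. Note that VHDL-2008 relaxes this restriction and allows out ports to be read.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity counter4 is
  port (
    clk   : in  std_logic;                    -- read-only inside the component
    rst   : in  std_logic;
    count : out std_logic_vector(3 downto 0)  -- assigned, never read
  );
end entity counter4;

architecture rtl of counter4 is
  -- Internal copy of the count that can be both read and assigned.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;  -- reads the internal signal, not the port
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_i);
end architecture rtl;
```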

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

Configuration declaration links architecture to entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and architecture body together

The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., a gate-level architecture and a behavioral architecture

Configurations are always optional
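As a hedged illustration, the sketch below gives the Comparator entity two architectures and uses a configuration declaration to select one of them. The second architecture (Comparator2) and the configuration name Comparator_cfg are invented for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- First architecture: direct comparison.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Second, hypothetical architecture: bitwise XOR followed by a zero check.
architecture Comparator2 of Comparator is
  signal diff : std_logic_vector(7 downto 0);
begin
  diff <= A xor B;
  EQ   <= '1' when diff = "00000000" else '0';
end Comparator2;

-- Configuration declaration: binds the entity to Comparator1 explicitly,
-- instead of relying on whichever architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end configuration Comparator_cfg;
```

Changing the name in the for clause to Comparator2 switches the architecture without touching the source of either architecture.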

Packages: Packages contain information common to many design units

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units

A package consists of a declaration part and an optional body part

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
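A minimal sketch of a package with a declaration part and a body is given below. The package name vga_pkg, the constants, the subtype color_t, and the function is_black are all hypothetical; only the 640 x 480 resolution and the 2-bit color width come from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the publicly visible part.
package vga_pkg is
  constant H_VISIBLE : natural := 640;  -- visible pixels per line
  constant V_VISIBLE : natural := 480;  -- visible lines per frame

  subtype color_t is std_logic_vector(1 downto 0);  -- 2 bits per color

  -- Function declaration only; the body is in the package body.
  function is_black (r, g, b : color_t) return boolean;
end package vga_pkg;

-- Package body: needed here because the package declares a function.
package body vga_pkg is
  function is_black (r, g, b : color_t) return boolean is
  begin
    return r = "00" and g = "00" and b = "00";
  end function is_black;
end package body vga_pkg;
```

A design unit would make these names visible with "use work.vga_pkg.all;", assuming the package is compiled into the library work.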

Libraries: Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system

Can also be any other kind of database

                                                    Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g., transistor-level description) to a high-level description (e.g., block diagram).

For large circuits, a low-level description quickly becomes impractical.

                                                    103 Dataflow VHDL Description

                                                    38

The circuit is described in terms of how data moves through the system.

In the dataflow style you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- either a lower-level description

- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
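The combinational part of a dataflow description can be sketched with concurrent signal assignments. The full adder below is an illustrative assumption (the entity name full_adder and the internal signal p are not taken from the project); it shows only the combinational logic, not the registers around it.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity full_adder is
  port (a, b, cin : in std_logic; sum, cout : out std_logic);
end full_adder;

architecture dataflow of full_adder is
  signal p : std_logic;  -- propagate term shared by both outputs
begin
  -- Concurrent signal assignments: each one is re-evaluated whenever
  -- a signal on its right-hand side changes; textual order is irrelevant.
  p    <= a xor b;
  sum  <= p xor cin;
  cout <= (a and b) or (p and cin);
end dataflow;
```

Registers placed before or after this logic would still need their own description, for example a behavioral process, because the language has no register primitive.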

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of the sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
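A behavioral description of a register might look like the sketch below. The entity name dff is hypothetical; the active-low reset name rstn simply follows the naming used in the project listings. The 10 ns delay is included only to illustrate the "after" clause mentioned above.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity dff is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;  -- active-low asynchronous reset
    d    : in  std_logic;
    q    : out std_logic);
end dff;

architecture behavioral of dff is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk) then
      -- The delay is honoured by a simulator; synthesis tools
      -- typically ignore it and rely on the target technology.
      q <= d after 10 ns;
    end if;
  end process;
end behavioral;
```

The process states what happens on each clock edge rather than which gates implement it, which is the defining trait of the behavioral style.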

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

                                                    General format

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
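The two permitted forms can be compared side by side. In this sketch (entity and label names are hypothetical), both processes compute the same AND function: the first uses a sensitivity list, the second uses an explicit wait statement as its last statement.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity process_forms is
  port (a, b : in std_logic; y1, y2 : out std_logic);
end process_forms;

architecture demo of process_forms is
begin
  -- Form 1: sensitivity list, no wait statement
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait as the last statement
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;  -- suspend until a or b changes
  end process with_wait;
end demo;
```

Both processes run once at the start of simulation and are then woken by any event on a or b, so y1 and y2 always carry the same value.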

                                                    11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;

                                                    SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (Row Information Will be Sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
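The counter limits in the synchronization block are written as bare numbers. One possible way to make such limits self-documenting is a small package of named constants. The sketch below is an assumption, not part of the project: the package name and constant names are hypothetical, and the values are the nominal 640x480 at 60 Hz figures (pixel clock of about 25.175 MHz), which differ slightly from some of the limits used in the listing above.

```vhdl
-- Hypothetical package; names are illustrative only.
package vga_timing_pkg is
  -- Horizontal timing, in pixel clocks (nominal 640x480 @ 60 Hz)
  constant H_SYNC    : natural := 96;   -- sync pulse width
  constant H_BACK    : natural := 48;   -- back porch
  constant H_VISIBLE : natural := 640;  -- visible pixels per line
  constant H_FRONT   : natural := 16;   -- front porch
  constant H_TOTAL   : natural := H_SYNC + H_BACK + H_VISIBLE + H_FRONT;  -- 800

  -- Vertical timing, in lines
  constant V_SYNC    : natural := 2;    -- sync pulse width
  constant V_BACK    : natural := 33;   -- back porch
  constant V_VISIBLE : natural := 480;  -- visible lines per frame
  constant V_FRONT   : natural := 10;   -- front porch
  constant V_TOTAL   : natural := V_SYNC + V_BACK + V_VISIBLE + V_FRONT;  -- 525

  -- First visible column and line when counting starts at the sync pulse
  constant H_ACTIVE_START : natural := H_SYNC + H_BACK;  -- 144
  constant V_ACTIVE_START : natural := V_SYNC + V_BACK;  -- 35
end package vga_timing_pkg;
```

With such a package, a comparison like c_hs_s <= 95 could be written against H_SYNC - 1, and the start of the vertical enable window against V_ACTIVE_START, which makes the intent of each threshold easier to check.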

                                                    PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;

                                                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be found
in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more
PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC
is another such device, which uses an AVR processor in combination with
Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or reconfigurable
systems: CPUs that reconfigure themselves to suit the task at hand. The
Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable
soft processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

                                                    Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: millions

                                                    Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, which
the company credits for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
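The precision-for-parallelism trade-off described above can be sketched in software. The example below is an illustration only and is not taken from the source: it assumes a simple unsigned fixed-point format with a chosen number of fractional bits (the names to_fixed and fixed_dot are hypothetical). It shows a dot product computed on small integers, the kind of narrow arithmetic that lets more units fit in the same logic.

```python
def to_fixed(value, frac_bits):
    """Quantize a real number to an integer with `frac_bits` fractional bits."""
    return round(value * (1 << frac_bits))


def fixed_dot(xs, ys, frac_bits):
    """Dot product using integer multiply-accumulate only.

    Each product carries 2 * frac_bits fractional bits, so the accumulated
    integer is scaled back once at the end.
    """
    acc = sum(to_fixed(x, frac_bits) * to_fixed(y, frac_bits)
              for x, y in zip(xs, ys))
    return acc / (1 << (2 * frac_bits))


# With 4 fractional bits these inputs are represented exactly.
result = fixed_dot([0.5, 0.25], [0.5, 0.5], frac_bits=4)
```

Fewer fractional bits mean narrower multipliers and a larger quantization error. How many extra arithmetic units that buys depends on the device and is not quantified here.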

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
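The low-volume argument above amounts to a break-even calculation. The sketch below assumes a simple linear cost model (one-off engineering cost plus a fixed cost per unit). The function names and all dollar figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def total_cost(nre, unit_cost, units):
    """Total cost = one-off engineering cost + per-unit cost * volume."""
    return nre + unit_cost * units


def break_even_volume(asic_nre, fpga_nre, asic_unit_cost, fpga_unit_cost):
    """Volume at which the ASIC and FPGA routes cost the same.

    Below this volume the FPGA is cheaper overall, above it the ASIC is.
    """
    if fpga_unit_cost <= asic_unit_cost:
        raise ValueError("model assumes the FPGA costs more per unit")
    return (asic_nre - fpga_nre) / (fpga_unit_cost - asic_unit_cost)


# Hypothetical figures: ASIC NRE $1,050,000 at $5 per unit,
# FPGA NRE $50,000 at $45 per unit.
volume = break_even_volume(1_050_000, 50_000, 5, 45)
```

With these made-up numbers, the two routes cost the same at 25,000 units. Lower ASIC development cost or a higher FPGA unit price moves the break-even point down.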

                                                    Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
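Rent's rule relates the number of external terminals T of a block of logic to the number of gates (or logic blocks) g it contains: T = t * g^p, where t is the average number of terminals per block and p is the Rent exponent. The sketch below only evaluates that formula. The parameter values are illustrative assumptions, and the source does not say how any particular vendor turns such an estimate into a channel width.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g**p."""
    if blocks <= 0:
        raise ValueError("blocks must be positive")
    return terminals_per_block * blocks ** rent_exponent


# Illustrative values only: 1024 blocks, 4 terminals each, exponent 0.5.
estimate = rent_terminals(1024, 4, 0.5)
```

A larger exponent means the design needs more connections for the same amount of logic. This matches the crossbar versus systolic array contrast made in the text.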

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                    Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are managed separately.
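The logic block described above can be modelled behaviourally in a few lines. This is a minimal software sketch, not a hardware description. The class and function names are hypothetical, and it assumes the LUT contents are stored as a 16-bit integer whose bit i is the output for input combination i.

```python
class LogicBlock:
    """A 4-input LUT followed by a D flip-flop, with one selectable output."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table
        self.registered = registered  # choose flip-flop or raw LUT output
        self.ff = 0                   # flip-flop state, reset to 0

    def lut(self, a, b, c, d):
        """Combinational lookup: the four inputs form an index into the table."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model a clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output, registered or unregistered."""
        return self.ff if self.registered else self.lut(a, b, c, d)


def truth_table_from(func):
    """Build the 16 configuration bits from any 4-input boolean function."""
    table = 0
    for index in range(16):
        bits = [(index >> k) & 1 for k in range(4)]
        if func(*bits):
            table |= 1 << index
    return table


# Configure the LUT as a 4-input parity (XOR) function.
XOR4 = truth_table_from(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Any 4-input function fits in the same 16 bits. This is why the LUT is a convenient general-purpose building block.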

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                    Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                    Switch box topology
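The planar switch box rule described above (a wire can only reach the same track number on the other three sides) can be written down directly. The side names and the function name below are assumptions made for illustration.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """List the (side, track) wires reachable from a wire entering a switch box.

    In the planar (domain-based) topology the track number never changes,
    so each incoming wire has exactly three possible destinations.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track must be in range(channel_width)")
    return [(other, track) for other in SIDES if other != side]


# A wire arriving from the west on track 1 of a 4-track channel.
targets = planar_switch_targets("west", 1, 4)
```

Because tracks never mix, a signal that starts on one track stays on that track along its whole route. This keeps the switch box small but restricts the router's choices.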

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it's possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

                                                    Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                    Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on FPGAs for
hardware verification and early software development.

Verification of hardware designs alongside early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC designs on FPGAs
has become a good method for doing this.


                                                    Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This compares favorably with designers relying
only on software simulations to verify that their hardware design is sound,
since simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications, and hardware, to be found
early.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together outside the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


                                                    13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                    14 APPLICATIONS

                                                    Motion Tracking Systems

                                                    Image processing systems

                                                    Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.

                                                    70

                                                    REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. "A VGA display controller" by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"


The RGB color code stored in the RGB register is output from the VGA Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0, and B1).

The monitor itself expects analog signals as inputs to its electron gun control circuits. This is achieved by using 2-bit digital-to-analog converter circuitry located on the Nano Board itself, as shown in Figure 4.

Figure 9.1 Digital to Analog RGB conversion

For each color, the 2-bit digital signal from the VGA Controller can be converted into 4 distinct analog levels. These levels specify the intensity of each of the three primary colors to use when displaying the pixel on the monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum brightness).

With each analog input being one of four possible levels, the monitor can display each pixel on the screen with one of 64 different color permutations.
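
The counts quoted above follow directly from the bit widths. As a worked check, and assuming (this is not stated in the source) that the 2-bit DAC spaces its four levels evenly between 0 V and 0.7 V, with k the 2-bit code for one channel:

```latex
\begin{align*}
\text{levels per channel} &= 2^{2} = 4 \\
V_k &= \frac{k}{3}\times 0.7\,\text{V}, \quad k \in \{0,1,2,3\}
      \;\Rightarrow\; 0\,\text{V},\ \approx 0.233\,\text{V},\ \approx 0.467\,\text{V},\ 0.7\,\text{V} \\
\text{colors per pixel} &= 4^{3} = 2^{6} = 64
\end{align*}
```

The real intermediate voltages depend on the resistor network on the board, so only the end points (0 V and 0.7 V) and the counts (4 and 64) are taken from the text.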

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan™-3 Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks. Using various configuration options, SelectRAM blocks create RAM, ROM, FIFOs, large look-up tables, data-width converters, circular buffers, and shift registers, each supporting various data widths and depths. This application note describes the features and capabilities of block SelectRAM and illustrates how to specify the various options using the Xilinx CORE Generator™ system or via VHDL or Verilog instantiation.

Table 2: Block RAM Available in Spartan-3E Devices

Device       RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E     1             4                       4                  73,728           72
XC3S250E     2             6                       12                 221,184          216
XC3S500E     2             10                      20                 368,640          360
XC3S1200E    2             14                      28                 516,096          504
XC3S1600E    2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16 Kbits of which are allocated to data storage and, in some memory configurations, an additional 2 Kbits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable, and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
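
As an illustration of the synchronous, clock-enabled behavior described above, the following is a minimal sketch of a single-port RAM written so that a synthesis tool could map it onto block RAM. The entity name spram_16kx1, its port names, and the 16K x 1 organization are assumptions made for this example; whether a given tool actually infers block RAM from this style depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port RAM: 16K locations x 1 bit,
-- i.e. the 16 Kbit data portion of one block RAM.
entity spram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity spram_16kx1;

architecture rtl of spram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout only changes on an enabled clock edge.
        -- During a write, dout receives the old contents of the location.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

A second process of the same shape on a second clock and address would model the other port; it is left out here to keep the sketch short.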


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                                      Interoperability

                                                      Technology independence

                                                      Design reuse

                                                      Several levels of abstraction

                                                      Readability

                                                      Standard language

                                                      Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                                      Design specification language

                                                      Design entry language

                                                      Design simulation language

                                                      Design documentation language

                                                      An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987: standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 by itself doesn't support simulation conditions such as unknown and high-impedance (its predefined bit type has only the values '0' and '1').

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.

                                                      VHDL Environment

                                                      Design Units


                                                      Segments of VHDL code that can be compiled separately and stored in a

                                                      library

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines its pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
The entity declaration describes the entity.

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction, and a type.
An entity may have NO port declaration.

Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).
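
Because an out port cannot be read inside the component under the VHDL-87/93 rules listed above, a common workaround (instead of a buffer port) is to keep the value in an internal signal and drive the port from it. The sketch below illustrates this; the entity name counter4 and the signal name count_i are hypothetical and are not part of the VGA design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port directions.
entity counter4 is
  port (
    clk   : in  std_logic;                     -- in: read only
    rst   : in  std_logic;
    count : out std_logic_vector(3 downto 0)   -- out: assigned, never read
  );
end entity counter4;

architecture rtl of counter4 is
  -- Internal copy of the count; this signal can be both read and assigned.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;  -- reads count_i, not the out port
      end if;
    end if;
  end process;

  -- The out port is only ever driven, never read.
  count <= std_logic_vector(count_i);
end architecture rtl;
```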

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
- The last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Configurations are always optional.
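
To make the linking concrete, here is a minimal sketch of one entity with two architectures and a configuration declaration that selects one of them explicitly. It reuses the Comparator entity from the earlier example; the architecture names dataflow and behavioral and the configuration name Comparator_behavioral_cfg are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: concurrent (dataflow) description.
architecture dataflow of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture dataflow;

-- Second architecture: the same function described with a process.
architecture behavioral of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture behavioral;

-- Configuration declaration: binds the entity to the 'behavioral'
-- architecture instead of relying on the default (last analyzed).
configuration Comparator_behavioral_cfg of Comparator is
  for behavioral
  end for;
end configuration Comparator_behavioral_cfg;
```

Simulating Comparator_behavioral_cfg rather than Comparator itself is how a simulator would typically be told to use this binding; the exact command depends on the tool.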

Packages

Packages contain information common to many design units. They are meant for encapsulating data that can be shared globally among several design units. A package consists of a declaration part and an optional body part.

1. Package declaration can contain
- constant declarations
- type and subtype declarations
- subprograms (function and procedure declarations)
- alias declarations
- global signal declarations
- file declarations
- component declarations

2. Package body (not necessarily needed) consists of
- subprogram declarations and bodies (function bodies, procedure bodies)
- type and subtype declarations
- deferred constants
- file declarations
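
The split between package declaration and package body can be sketched as follows. The package name vga_pkg, the constants, the subtype color2_t, and the function is_visible are all hypothetical names chosen for this example; only the 640x480 resolution and the 2-bit color width come from the document.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: what other design units can see.
package vga_pkg is
  -- Visible area of the 640x480 mode discussed in this document.
  constant H_VISIBLE : natural := 640;
  constant V_VISIBLE : natural := 480;

  -- 2-bit intensity for one color channel.
  subtype color2_t is std_logic_vector(1 downto 0);

  -- Function declaration only; the body lives in the package body.
  function is_visible (x, y : natural) return boolean;
end package vga_pkg;

-- Package body: needed here only because a function was declared.
package body vga_pkg is
  function is_visible (x, y : natural) return boolean is
  begin
    return (x < H_VISIBLE) and (y < V_VISIBLE);
  end function is_visible;
end package body vga_pkg;
```

A design unit compiled into the same library would make these names visible with `use work.vga_pkg.all;`.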

Libraries

A collection of VHDL design units (a database):
1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL description

The circuit is described in terms of its components.
Ranges from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.
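
As a small illustration of the structural style, the sketch below builds a half adder purely by wiring together two gate components. The entities and2, xor2, and half_adder are hypothetical examples, and direct entity instantiation (`entity work.…`) assumes a VHDL-93 or later tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture dataflow of and2 is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b       : in  std_logic;
        sum, carry : out std_logic);
end entity half_adder;

-- Structural architecture: no logic expressions, only component
-- instances and the connections between them.
architecture structural of half_adder is
begin
  u_xor : entity work.xor2 port map (a => a, b => b, y => sum);
  u_and : entity work.and2 port map (a => a, b => b, y => carry);
end architecture structural;
```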

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers, you must write the behavioral description of the registers yourself.
The behavioral description can be provided in the form of subprograms (functions or procedures).
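
Since the language has no built-in register, a behavioral register description has to be written by hand. The following is a minimal sketch using a clocked process rather than a subprogram; the entity name reg8, its 8-bit width, and the synchronous active-high reset are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with synchronous reset.
entity reg8 is
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  -- q is updated only on the rising clock edge, which is what
  -- makes this a register rather than combinational logic.
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        q <= (others => '0');
      else
        q <= d;
      end if;
    end if;
  end process;
end architecture behavioral;
```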

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, only the order of sequential operations is defined.
At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.
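
The delay example mentioned above can be tried in a simulator with a small, port-less test entity such as the sketch below. The entity name delay_demo and the stimulus times are hypothetical; the `after` clause and `wait for` statements are meant for simulation only and, as noted above, synthesis tools generally ignore or reject such timing.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only example: an entity with no ports.
entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal b : std_logic := '0';
  signal a : std_logic;
begin
  -- a follows b, 10 ns later (simulation timing only).
  a <= b after 10 ns;

  -- Stimulus: toggle b twice, then stop forever.
  stimulus : process
  begin
    wait for 25 ns;
    b <= '1';
    wait for 25 ns;
    b <= '0';
    wait;
  end process stimulus;
end architecture behavioral;
```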

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.
Declared with the process keyword in VHDL.
A process statement is a concurrent statement.
Statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
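
The two alternatives in the general format can be compared side by side. In the sketch below, the same 2-to-1 multiplexer is written once with a sensitivity list and once with a wait statement at the end of the process, which behave the same way in simulation. The entity name mux2 and its port names are hypothetical, and some synthesis tools accept only the sensitivity-list form.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux2 is
  port (
    a, b, sel : in  std_logic;
    y_list    : out std_logic;  -- driven by the sensitivity-list process
    y_wait    : out std_logic   -- driven by the wait-statement process
  );
end entity mux2;

architecture behavioral of mux2 is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b, sel)
  begin
    if sel = '1' then
      y_list <= b;
    else
      y_list <= a;
    end if;
  end process with_list;

  -- Form 2: no sensitivity list, wait statement at the end.
  with_wait : process
  begin
    if sel = '1' then
      y_wait <= b;
    else
      y_wait <= a;
    end if;
    wait on a, b, sel;
  end process with_wait;
end architecture behavioral;
```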


                                                      11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- numeric_std is only referenced by the commented-out to_unsigned line below
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

component memory
    port (
        addr  : IN  std_logic_VECTOR(17 downto 0);
        clk   : IN  std_logic;
        dout  : OUT std_logic_VECTOR(0 downto 0);
        en    : IN  std_logic;
        sinit : IN  std_logic);
END component;

component sync_unit port(
    clk        : in    std_logic;
    rstn       : in    std_logic;
    hsync      : out   std_logic;  ---horizontal synch pulse(row)
    vsync      : out   std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out   std_logic;
    h_en       : out   std_logic;
    v_en       : out   std_logic;
    line_count : inout integer;
    c_hs       : inout integer);
end component;

component pixelgen port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic);
end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

-- Divide the board clock by two to obtain the pixel clock
process(clk50m, rstn)
begin
    if(rstn = '0')then
        clk <= '0';
    elsif(clk50m = '1' and clk50m'event)then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

-- Sync generator: produces the sync pulses, enables and counters
sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);

-- Pixel generator: drives the colour outputs and the memory read address
pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

-- Image memory, one bit per pixel.
-- rstn is active low: check that the sinit polarity of the memory core matches.
mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;


                                                      SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter (counts pixel clocks 0..793 within a line)
process(clk, rstn)
begin
    if(rstn = '0')then
        c_hs_s <= 0;
    else
        if(clk = '1' and clk'event)then
            if(c_hs_s=793)then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s +1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter (counts lines 0..515, clocked by the rising edge of h_en_s)
process(h_en_s, rstn)
begin
    if(rstn = '0')then
        line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
        line_count_s <= line_count_s + 1;
        if (line_count_s=515)then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if(rstn = '0')then
        hsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(c_hs_s <=95)then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable ( Row Information Will be Sent in this period)
begin
    if(rstn = '0')then
        h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((c_hs_s >=143) and (c_hs_s<=778))then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if(rstn = '0')then
        v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
        if((line_count_s>=35 and line_count_s<=515))then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync Front porch Back Porch
begin
    if(rstn = '0')then
        vsync <= '1';
    elsif(clk = '1' and clk'event)then
        if(line_count_s <2)then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- v_en delayed by one clock (used by pixelgen to detect the start of a frame)
process(clk)
begin
    if(clk = '1' and clk'event)then
        v_en_reg<=v_en_s;
    end if;
end process;

end arch_sync_unit;
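The literal thresholds in the synchronization block can be read as a timing table. The sketch below collects them as named constants and derives the line and frame periods they imply. The package vga_timing_sketch, the entity vga_timing_check and all constant names are hypothetical and are not part of the original design. The 40 ns pixel-clock period is an assumption: it holds only if clk50m really is a 50 MHz clock, as its name suggests, since the top level divides it by two.

```vhdl
-- Hypothetical package (not in the original design): names the literal
-- thresholds used by sync_unit and derives the timing they imply.
package vga_timing_sketch is
    constant CLK_PERIOD   : time    := 40 ns;  -- assumed 25 MHz pixel clock
    constant H_LAST       : integer := 793;    -- c_hs_s wraps after this count
    constant H_SYNC_LAST  : integer := 95;     -- hsync low for counts 0..95
    constant H_EN_FIRST   : integer := 143;    -- h_en_s set for counts 143..778
    constant H_EN_LAST    : integer := 778;
    constant V_LAST       : integer := 515;    -- line_count_s wraps after this count
    constant V_SYNC_LINES : integer := 2;      -- vsync low for lines 0 and 1

    constant H_TOTAL      : integer := H_LAST + 1;                  -- clocks per line
    constant H_SYNC_WIDTH : integer := H_SYNC_LAST + 1;             -- sync pulse width
    constant H_EN_WIDTH   : integer := H_EN_LAST - H_EN_FIRST + 1;  -- enable width
    constant V_TOTAL      : integer := V_LAST + 1;                  -- lines per frame
    constant LINE_PERIOD  : time    := H_TOTAL * CLK_PERIOD;
    constant FRAME_PERIOD : time    := V_TOTAL * LINE_PERIOD;
end package vga_timing_sketch;

use work.vga_timing_sketch.all;

-- Simulation-only check of the arithmetic above.
entity vga_timing_check is
end vga_timing_check;

architecture sim of vga_timing_check is
begin
    process
    begin
        assert H_TOTAL = 794 and H_SYNC_WIDTH = 96 and H_EN_WIDTH = 636
            report "unexpected horizontal timing" severity failure;
        assert V_TOTAL = 516
            report "unexpected vertical timing" severity failure;
        assert LINE_PERIOD = 31760 ns
            report "unexpected line period" severity failure;
        assert FRAME_PERIOD = 16388160 ns
            report "unexpected frame period" severity failure;
        report "line period  = " & time'image(LINE_PERIOD);
        report "frame period = " & time'image(FRAME_PERIOD);
        wait;  -- stop after one pass
    end process;
end sim;
```

Under these assumptions a line lasts 31.76 us and a frame about 16.39 ms, which corresponds to a refresh rate of roughly 61 Hz. For comparison, the standard 640 x 480 at 60 Hz mode uses 800 pixel clocks per line and 525 lines per frame, so the counters in this design are close to, but not identical with, the standard timing.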


                                                      PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if(rstn = '0')then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
        -- rising edge of v_en (start of the visible frame): restart the address
        if((v_en='1') and (v_en_reg='0'))then
            addr_s<=(others =>'0');
        end if;
        if(v_en='1' and h_en='1')then
            if(line_count<=514)then
                if(c_hs <= 655)then
                    -- image area: show the memory bit on all three colours
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <=addr_s +"000000000000000001";
                else
                    -- to the right of the image: green
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                -- below the image: white
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            -- wrap the address after the last image pixel
            if(addr_s="111011111111111111")then
                addr_s<= (others =>'0');
            end if;
        else
            -- blanking: all colours off
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
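The wrap value used for addr_s in the pixel generator is easier to interpret in decimal. The sketch below is a simulation-only check that the 18-bit pattern equals 512 x 480 - 1. The entity addr_wrap_check and its constants are hypothetical, and the 512 x 480 image size is an inference from the thresholds in the code (480 lines from 35 to 514, and a read window that ends at c_hs = 655), not something the source states.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical simulation-only entity (not part of the original design).
entity addr_wrap_check is
end addr_wrap_check;

architecture sim of addr_wrap_check is
    -- Wrap value copied from the pixelgen process.
    constant ADDR_WRAP  : std_logic_vector(17 downto 0) := "111011111111111111";
    -- Assumed image geometry (an inference, not stated in the source).
    constant IMG_WIDTH  : integer := 512;
    constant IMG_HEIGHT : integer := 480;
begin
    process
    begin
        -- The last address of a 512 x 480 one-bit image is 245759.
        assert to_integer(unsigned(ADDR_WRAP)) = IMG_WIDTH * IMG_HEIGHT - 1
            report "wrap value does not match a 512 x 480 frame"
            severity failure;
        -- An 18-bit address bus can reach 2**18 locations, enough for the image.
        assert IMG_WIDTH * IMG_HEIGHT <= 2**18
            report "image does not fit in an 18-bit address space"
            severity failure;
        report "last address = " & integer'image(to_integer(unsigned(ADDR_WRAP)));
        wait;  -- stop after one pass
    end process;
end sim;
```

If the assumption holds, the memory stores one bit for each of 245,760 pixels, which fits in the 262,144 locations that the 18-bit address can select.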


                                                      12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
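A lookup table is one common way for a logic block to realize an arbitrary combinational function: the inputs select one bit of a stored truth table. The sketch below models a 3-input lookup table in VHDL. The entity lut3_sketch and its INIT generic are hypothetical names chosen for illustration; they do not describe the internals of any particular FPGA family.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of a 3-input lookup table (illustration only).
entity lut3_sketch is
    generic (
        -- Truth table: bit i is the output for input value i.
        -- "10000000" gives a 3-input AND, "10010110" a 3-input XOR.
        INIT : std_logic_vector(7 downto 0) := "10000000"
    );
    port (
        a : in  std_logic_vector(2 downto 0);
        y : out std_logic
    );
end lut3_sketch;

architecture rtl of lut3_sketch is
begin
    -- The three inputs form an index into the stored truth table.
    y <= INIT(to_integer(unsigned(a)));
end rtl;
```

Changing only the INIT value turns the same structure into a different gate, which is the sense in which a logic block is configurable.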

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985, the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field
to fix bugs, and lower non-recurring engineering costs. Vendors can also take a
middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the design
has been committed.

                                                      Xilinx claims that several market and technology dynamics are changing the
                                                      ASIC/FPGA paradigm:

                                                      IC costs are rising aggressively
                                                      ASIC complexity has increased development time and costs
                                                      R&D resources and headcount are decreasing
                                                      Revenue losses for slow time-to-market are increasing
                                                      Financial constraints in a poor economy are driving low-cost technologies

                                                      These trends make FPGAs a better alternative than ASICs for a growing
                                                      number of higher-volume applications than they have historically been used
                                                      for, to which the company attributes the growing number of FPGA design
                                                      starts (see History).

                                                      The primary differences between CPLDs and FPGAs are architectural. A
                                                      CPLD has a somewhat restrictive structure consisting of one or more
                                                      programmable sum-of-products logic arrays feeding a relatively small number
                                                      of clocked registers. The result is less flexibility, with the advantage of
                                                      more predictable timing delays and a higher logic-to-interconnect ratio. FPGA
                                                      architectures, on the other hand, are dominated by interconnect. This makes
                                                      them far more flexible (in terms of the range of designs that are practical
                                                      to implement within them) but also far more complex to design for.

                                                      Another notable difference between CPLDs and FPGAs is the presence in most
                                                      FPGAs of higher-level embedded functions (such as adders and multipliers)
                                                      and embedded memories, as well as logic blocks that can implement decoders
                                                      or mathematical functions.

                                                      Some FPGAs have the capability of partial re-configuration, which lets one
                                                      portion of the device be re-programmed while other portions continue running.

                                                      12.3 Applications

                                                      Applications of FPGAs include digital signal processing, software-defined
                                                      radio, aerospace and defense systems, ASIC prototyping, medical imaging,
                                                      computer vision, speech recognition, cryptography, bioinformatics, computer
                                                      hardware emulation, radio astronomy, and a growing range of other areas.

                                                      FPGAs originally began as competitors to CPLDs and competed in a similar
                                                      space, that of glue logic for PCBs. As their size, capabilities, and speed
                                                      increased, they began to take over larger and larger functions, to the point
                                                      where some are now marketed as full systems on chips (SoC). Particularly with
                                                      the introduction of dedicated multipliers into FPGA architectures in the late
                                                      1990s, applications which had traditionally been the sole reserve of DSPs
                                                      began to incorporate FPGAs instead.

                                                      FPGAs especially find applications in any area or algorithm that can make use
                                                      of the massive parallelism offered by their architecture. One such area is
                                                      code breaking, in particular brute-force attacks on cryptographic algorithms.

                                                      FPGAs are increasingly used in conventional high performance computing
                                                      applications, where computational kernels such as FFT or convolution are
                                                      performed on the FPGA instead of a microprocessor.


                                                      The inherent parallelism of the logic resources on an FPGA allows for
                                                      considerable computational throughput even at low MHz clock rates. The
                                                      flexibility of the FPGA allows for even higher performance by trading off
                                                      precision and range in the number format for an increased number of parallel
                                                      arithmetic units. This has driven a new type of processing called
                                                      reconfigurable computing, where time-intensive tasks are offloaded from
                                                      software to FPGAs.
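The trade-off between number format and parallelism can be made concrete with a rough back-of-the-envelope sketch. Everything here is an illustrative assumption rather than a figure from this report: the logic budget, the 100 MHz clock, and the simplification that a multiplier's area grows with the square of its word width.

```python
def units_that_fit(logic_budget: int, word_bits: int) -> int:
    """Number of multipliers that fit, assuming area grows with word_bits squared."""
    if word_bits <= 0:
        raise ValueError("word_bits must be positive")
    return logic_budget // (word_bits * word_bits)


def throughput_ops_per_second(parallel_units: int, clock_hz: float) -> float:
    """Peak throughput when every unit completes one operation per clock cycle."""
    return parallel_units * clock_hz


LOGIC_BUDGET = 100_000   # hypothetical logic budget, arbitrary units
FPGA_CLOCK_HZ = 100e6    # hypothetical 100 MHz fabric clock

for bits in (32, 16):
    units = units_that_fit(LOGIC_BUDGET, bits)
    rate = throughput_ops_per_second(units, FPGA_CLOCK_HZ)
    print(f"{bits}-bit: {units} units, {rate:.2e} ops/s")
```

Under these assumptions, halving the word width roughly quadruples the number of arithmetic units that fit, and therefore the peak throughput at the same clock rate.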

                                                      The adoption of FPGAs in high performance computing is currently limited by
                                                      the complexity of FPGA design compared to conventional software and by the
                                                      extremely long turn-around times of current design tools, where a wait of
                                                      4-8 hours is necessary after even minor changes to the source code.

                                                      Traditionally, FPGAs have been reserved for specific vertical applications
                                                      where the volume of production is small. For these low-volume applications,
                                                      the premium that companies pay in hardware costs per unit for a
                                                      programmable chip is more affordable than the development resources spent
                                                      on creating an ASIC. Today, new cost and performance dynamics have broadened
                                                      the range of viable applications.

                                                      Architecture

                                                      The most common FPGA architecture consists of an array of configurable
                                                      logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
                                                      channels have the same width (number of wires). Multiple I/O pads may fit
                                                      into the height of one row or the width of one column in the array.

                                                      An application circuit must be mapped into an FPGA with adequate resources.
                                                      While the number of CLBs and I/Os required is easily determined from the
                                                      design, the number of routing tracks needed may vary considerably, even
                                                      among designs with the same amount of logic. (For example, a crossbar switch
                                                      requires much more routing than a systolic array with the same gate count.)
                                                      Since unused routing tracks increase the cost (and decrease the performance)
                                                      of the part without providing any benefit, FPGA manufacturers try to provide
                                                      just enough tracks so that most designs that will fit in terms of LUTs and
                                                      I/Os can be routed. This is determined by estimates such as those derived
                                                      from Rent's rule or by experiments with existing designs.
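As a rough illustration of the kind of estimate Rent's rule gives, the sketch below evaluates its usual form T = t * g^p, where T is the number of external terminals of a region, g the number of logic blocks in it, t the average number of terminals per block, and p the Rent exponent. The numeric values are hypothetical and are not taken from any particular FPGA family or from this report.

```python
def rent_terminals(num_blocks: int, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate external terminals of a region with Rent's rule: T = t * g**p."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return terminals_per_block * num_blocks ** rent_exponent


# Hypothetical region of 1024 logic blocks with 5 terminals per block.
# A higher exponent models a more interconnect-hungry design.
for p in (0.5, 0.7):
    estimate = rent_terminals(1024, 5.0, p)
    print(f"p = {p}: about {estimate:.0f} terminals")
```

Two designs with the same amount of logic but different Rent exponents produce very different wiring demand, which is the effect the paragraph above describes for a crossbar switch versus a systolic array.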

                                                      The FPGA considered here is an array, or island-style, FPGA. It consists of
                                                      an array of logic blocks and routing channels. Two I/O pads fit into the
                                                      height of one row or the width of one column, as shown below. All the routing
                                                      channels have the same width (number of wires).

                                                      12.1 FPGA structure

                                                      A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                                                      flip-flop, as shown below. In recent years, manufacturers have started moving
                                                      to 6-input LUTs in their high performance parts, claiming increased
                                                      performance.

                                                      Typical logic block


                                                      There is only one output, which can be either the registered or the
                                                      unregistered LUT output. The logic block has four inputs for the LUT and a
                                                      clock input. Since clock signals (and often other high-fanout signals) are
                                                      normally routed via special-purpose dedicated routing networks in commercial
                                                      FPGAs, they are managed separately from other signals.
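The behaviour of such a logic block can be sketched in a few lines of software. This is only a behavioural model under stated assumptions: the class name `LogicBlock`, the 16-bit truth-table encoding, and the input ordering are illustrative choices, not a vendor primitive or the design used in this report.

```python
class LogicBlock:
    """Behavioural model: a 4-input LUT feeding a D flip-flop, one output."""

    def __init__(self, truth_table: int, registered: bool):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table  # bit i holds the output for input index i
        self.registered = registered    # select flip-flop output or raw LUT output
        self.ff = 0                     # flip-flop state, cleared at start

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the stored output bit for the four input bits."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# 0x8000 sets only entry 15, so the LUT acts as a 4-input AND gate.
and_gate = LogicBlock(0x8000, registered=False)
print(and_gate.output(1, 1, 1, 1), and_gate.output(1, 0, 1, 1))
```

The same block can implement any function of four inputs simply by loading a different 16-bit table, which is why the LUT is the basic unit of configurability.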

                                                      For this example architecture, the locations of the FPGA logic block pins are
                                                      shown below.

                                                      Logic Block Pin Locations

                                                      Each input is accessible from one side of the logic block, while the output
                                                      pin can connect to routing wires in both the channel to the right and the
                                                      channel below the logic block.

                                                      Each logic block output pin can connect to any of the wiring segments in the
                                                      channels adjacent to it.

                                                      Similarly, an I/O pad can connect to any one of the wiring segments in the
                                                      channel adjacent to it. For example, an I/O pad at the top of the chip can
                                                      connect to any of the W wires (where W is the channel width) in the horizontal
                                                      channel immediately below it.

                                                      Generally, the FPGA routing is unsegmented. That is, each wiring segment
                                                      spans only one logic block before it terminates in a switch box. By turning on
                                                      some of the programmable switches within a switch box, longer paths can be
                                                      constructed. For higher speed interconnect, some FPGA architectures use
                                                      longer routing lines that span multiple logic blocks.

                                                      Whenever a vertical and a horizontal channel intersect, there is a switch box.
                                                      In this architecture, when a wire enters a switch box, there are three
                                                      programmable switches that allow it to connect to three other wires in adjacent
                                                      channel segments. The pattern, or topology, of switches used in this
                                                      architecture is the planar or domain-based switch box topology. In this switch
                                                      box topology, a wire in track number 1 connects only to wires in track
                                                      number 1 in adjacent channel segments, wires in track number 2 connect
                                                      only to other wires in track number 2, and so on. The figure below illustrates
                                                      the connections in a switch box.
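The planar topology is simple enough to enumerate directly. The sketch below is a minimal model under assumptions of its own: the four sides are labelled north, east, south, and west, tracks are numbered from 0 rather than 1, and the function names are illustrative.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width: int):
    """List every programmable switch as a pair of (side, track) wire ends.

    In the planar topology a wire only ever connects to wires on the same
    track number on the other three sides of the switch box.
    """
    switches = []
    for track in range(channel_width):
        for side_a, side_b in combinations(SIDES, 2):
            switches.append(((side_a, track), (side_b, track)))
    return switches


def neighbours(switches, wire):
    """Wires reachable from `wire` through a single switch."""
    return sorted(b if a == wire else a for a, b in switches if wire in (a, b))


box = planar_switch_box(4)
print(len(box))                       # number of switches for W = 4
print(neighbours(box, ("north", 1)))  # the three wires on the same track
```

Each wire end sees exactly three switches, matching the description above, and the total switch count grows linearly with the channel width W.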

                                                      Switch box topology

                                                      Modern FPGA families expand upon the above capabilities to include higher
                                                      level functionality fixed into the silicon. Having these common functions
                                                      embedded into the silicon reduces the area required and gives those functions
                                                      increased speed compared to building them from primitives. Examples of these
                                                      include multipliers, generic DSP blocks, embedded processors, high speed I/O
                                                      logic, and embedded memories.


                                                      FPGAs are also widely used for systems validation, including pre-silicon
                                                      validation, post-silicon validation, and firmware development. This allows
                                                      chip companies to validate their design before the chip is produced in the
                                                      factory, reducing the time to market.

                                                      12.4 FPGA Design and Programming

                                                      To define the behavior of the FPGA, the user provides a hardware description
                                                      language (HDL) or a schematic design. The HDL form might be easier to work
                                                      with when handling large structures, because it is possible to just specify
                                                      them numerically rather than having to draw every piece by hand. On the other
                                                      hand, schematic entry can allow for easier visualisation of a design.

                                                      Then, using an electronic design automation tool, a technology-mapped netlist
                                                      is generated. The netlist can then be fitted to the actual FPGA architecture
                                                      using a process called place-and-route, usually performed by the FPGA
                                                      company's proprietary place-and-route software. The user will validate the
                                                      map, place, and route results via timing analysis, simulation, and other
                                                      verification methodologies. Once the design and validation process is
                                                      complete, the binary file generated (also using the FPGA company's
                                                      proprietary software) is used to (re)configure the FPGA.

                                                      Going from schematic/HDL source files to actual configuration: the source
                                                      files are fed to a software suite from the FPGA/CPLD vendor that, through
                                                      different steps, will produce a file. This file is then transferred to the
                                                      FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                                                      like an EEPROM.

                                                      The most common HDLs are VHDL and Verilog, although in an attempt to
                                                      reduce the complexity of designing in HDLs, which have been compared to the
                                                      equivalent of assembly languages, there are moves to raise the abstraction
                                                      level through the introduction of alternative languages.


                                                      To simplify the design of complex systems in FPGAs, there exist libraries of
                                                      predefined complex functions and circuits that have been tested and optimized
                                                      to speed up the design process. These predefined circuits are commonly called
                                                      IP cores and are available from FPGA vendors and third-party IP suppliers
                                                      (rarely free, and typically released under proprietary licenses). Other
                                                      predefined circuits are available from developer communities such as
                                                      OpenCores (typically free, and released under the GPL, BSD, or a similar
                                                      license) and other sources.

                                                      In a typical design flow, an FPGA application developer will simulate the
                                                      design at multiple stages throughout the design process. Initially, the RTL
                                                      description in VHDL or Verilog is simulated by creating test benches to
                                                      simulate the system and observe results. Then, after the synthesis engine has
                                                      mapped the design to a netlist, the netlist is translated to a gate level
                                                      description, where simulation is repeated to confirm that the synthesis
                                                      proceeded without errors. Finally, the design is laid out in the FPGA, at
                                                      which point propagation delays can be added and the simulation run again with
                                                      these values back-annotated onto the netlist.

                                                      Basic Process Technology Types

                                                      SRAM - Based on static memory technology. In-system programmable and
                                                      re-programmable. Requires external boot devices. CMOS.

                                                      Antifuse - One-time programmable. CMOS.

                                                      EPROM - Erasable Programmable Read-Only Memory technology. Usually
                                                      one-time programmable in production because of plastic packaging.
                                                      Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                                      EEPROM - Electrically Erasable Programmable Read-Only Memory
                                                      technology. Can be erased even in plastic packages. Some, but not all,
                                                      EEPROM devices can be in-system programmed. CMOS.

                                                      Flash - Flash-erase EPROM technology. Can be erased even in plastic
                                                      packages. Some, but not all, flash devices can be in-system programmed.
                                                      Usually a flash cell is smaller than an equivalent EEPROM cell and is
                                                      therefore less expensive to manufacture. CMOS.

                                                      Fuse - One-time programmable. Bipolar.

                                                      Major Manufacturers

                                                      Xilinx and Altera are the current FPGA market leaders and long-time industry
                                                      rivals. Together they control over 80 percent of the market, with Xilinx alone
                                                      representing over 50 percent.

                                                      Xilinx also provides free Windows and Linux design software, while Altera
                                                      provides free Windows tools; the Solaris and Linux tools are only available
                                                      via a rental scheme.

                                                      Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                                                      (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                                                      Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
                                                      focused CSSP, no general purpose FPGAs).

                                                      12.5 FPGA prototype

                                                      FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                                                      prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                                                      for hardware verification and early software development.

                                                      Verifying hardware designs alongside early software and firmware co-design
                                                      has become mainstream. Prototyping SoC and ASIC designs on an FPGA has
                                                      become a good method for doing this.


                                                      Reasons why Prototyping is important

                                                      1. Running an SoC design on an FPGA prototype is a reliable way to ensure
                                                      that it is functionally correct. This compares favorably with designers
                                                      relying only on software simulations to verify that their hardware design is
                                                      sound, where simulation speed and modeling accuracy limitations hinder
                                                      development.

                                                      2. Due to time constraints, many projects cannot wait until the silicon is
                                                      back from the foundry to start on software tests. FPGA prototyping allows
                                                      for much more time for software development and testing at the
                                                      software-hardware integration stage. This allows many unforeseen software
                                                      bugs, which appear due to today's array of operating systems, applications,
                                                      and hardware, to be found.

                                                      3. Prototyping also allows the developer to ensure that all IP technologies
                                                      on the system work well together, beyond the simulation stage and in actual
                                                      form.

                                                      4. Prototyping has the added advantage of serving as a demo platform for SoC
                                                      clients, bringing in interest early. This speeds up the overall development
                                                      cycle and allows for more enhancement or improvement of the chip features
                                                      than would otherwise have been possible.


                                                      13 SIMULATION RESULTS

                                                      GATE LEVEL
                                                      Fig VGA CONTROLLER

                                                      GATE LEVEL
                                                      Fig SYNC UNIT

                                                      TECHNOLOGY SCHEMATIC

                                                      14 APPLICATIONS

                                                      Motion Tracking Systems

                                                      Image processing systems

                                                      Displaying Systems

                                                      14.1 CONCLUSION

                                                      Hence, implemented on an FPGA, the Controller takes a processor-generated
                                                      (pixelated) picture from memory space and provides digital RGB values for
                                                      each pixel, as well as horizontal and vertical synchronization signals, in
                                                      order to correctly display the picture on a connected monitor.
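The synchronization behaviour summarised above can be sketched as a software model of the two counters a VGA controller typically runs. The timing constants below are the widely published 640x480 at 60 Hz values (800 pixel clocks per line, 525 lines per frame, 25.175 MHz pixel clock, active-low sync); they are an assumption for illustration and are not stated in this section, so the controller described in this report may differ.

```python
# Assumed 640x480 @ 60 Hz timing, in pixel clocks (horizontal) and lines (vertical).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK   # 800 clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK   # 525 lines per frame


def vga_signals(x: int, y: int):
    """Return (hsync, vsync, visible) for pixel counter x and line counter y.

    Sync pulses are modelled as active low: 0 during the pulse, 1 otherwise.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= x < h_start + H_SYNC else 1
    vsync = 0 if v_start <= y < v_start + V_SYNC else 1
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible


def refresh_rate_hz(pixel_clock_hz: float = 25.175e6) -> float:
    """Frames per second implied by the pixel clock and the total frame size."""
    return pixel_clock_hz / (H_TOTAL * V_TOTAL)


print(vga_signals(0, 0), vga_signals(656, 0), vga_signals(0, 490))
print(f"{refresh_rate_hz():.2f} Hz")
```

RGB values are only driven while `visible` is true; during the porches and sync pulses the colour outputs are held at black.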




                                                        The RGB color code stored in the RGB register is output from the VGA
                                                        Controller as separate 2-bit R, G, and B values (outputs R0, R1, G0, G1, B0,
                                                        and B1).

                                                        The monitor itself expects analog signals as inputs to its electron gun
                                                        control circuits. This is achieved by using 2-bit digital-to-analog converter
                                                        circuitry located on the Nano Board itself, as shown in Figure 4.

                                                        Figure 9.1 Digital to Analog RGB conversion


                                                        For each color, the 2-bit digital signal from the VGA Controller can be
                                                        converted into 4 distinct analog levels. These levels specify the intensity
                                                        of each of the three primary colors to use when displaying the pixel on the
                                                        monitor's screen. The levels range from 0 V (total darkness) to 0.7 V
                                                        (maximum brightness).

                                                        With each analog input being one of four possible levels, the monitor
                                                        can display each pixel on the screen with one of 64 different color
                                                        permutations.
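The count of 64 follows from three channels with four levels each (4 x 4 x 4). A small sketch makes the mapping explicit. The 0 V and 0.7 V end points come from the text above; the even spacing of the two intermediate levels is an assumption, since the actual values depend on the resistor network on the board, and the function names are illustrative.

```python
from itertools import product

V_MAX = 0.7   # volts at maximum brightness, from the text
LEVELS = 4    # a 2-bit code selects one of four levels


def dac_voltage(code: int) -> float:
    """Map a 2-bit code (0..3) to a voltage, assuming evenly spaced levels."""
    if not 0 <= code < LEVELS:
        raise ValueError("code must be a 2-bit value (0..3)")
    return V_MAX * (code / (LEVELS - 1))


def pixel_voltages(r: int, g: int, b: int):
    """Analog R, G, B voltages for one pixel given three 2-bit codes."""
    return tuple(dac_voltage(c) for c in (r, g, b))


all_colors = list(product(range(LEVELS), repeat=3))
print(len(all_colors))            # number of distinct colour codes
print(pixel_voltages(3, 0, 0))    # full red, no green, no blue
```

Each entry of `all_colors` corresponds to one setting of the six outputs R0, R1, G0, G1, B0, and B1.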

                                                        Using Block RAM in Spartan-3E Generation FPGAs

                                                        For applications requiring large on-chip memories, Spartan™-3
                                                        Generation FPGAs provide plentiful, efficient SelectRAM™ memory blocks.
                                                        Using various configuration options, SelectRAM blocks create RAM, ROM,
                                                        FIFOs, large look-up tables, data width converters, circular buffers, and
                                                        shift registers, each supporting various data widths and depths. This
                                                        application note describes the features and capabilities of block SelectRAM
                                                        and illustrates how to specify the various options using the Xilinx CORE
                                                        Generator™ system or via VHDL or Verilog instantiation.


Table 2: Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
which are allocated to data storage and, in some memory configurations, an
additional 2K bits allocated to parity or additional "plus" data bits. Physically,
the block RAM memory has two completely independent access ports, labeled
Port A and Port B. The structure is fully symmetrical: both ports are
interchangeable, and both support data read and write operations. Each
memory port is synchronous, with its own clock, clock enable, and write
enable. Read operations are also synchronous and require a clock edge and
clock enable. Though physically a dual-port memory, block RAM simulates
single-port memory in an application, as shown in Figure 1. Furthermore, each
block memory supports multiple configurations or aspect ratios. Table 3
summarizes the essential SelectRAM features. Multiple block RAMs can be
cascaded to create deeper and wider memory organizations, with a minimal
timing penalty incurred through specialized routing resources.
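
The synchronous single-port behavior described above can be sketched as a memory written in plain VHDL. This is only a minimal sketch under stated assumptions: the entity name bram_16kx1, the port names, and the 16K x 1 organization are chosen for illustration, and whether a synthesis tool actually maps this description onto a block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous memory, 16K locations of 1 bit each.
entity bram_16kx1 is
  port (
    clk  : in  std_logic;
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic;
    dout : out std_logic
  );
end entity bram_16kx1;

architecture rtl of bram_16kx1 is
  type ram_t is array (0 to 2**14 - 1) of std_logic;
  signal ram : ram_t := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout changes only on a clock edge with en = '1'.
        -- During a write, the previously stored value is read (read-first).
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the read and the write need a clock edge and an active clock enable, which matches the port behavior described in the paragraph above.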

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers
Funded by the US Department of Defense
The first publicly available version was released in 1985
In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
presented with a proposal to standardize VHDL
In 1987 standardization => IEEE 1076-1987
An improved version of the language was released in 1994 => IEEE
Standard 1076-1993

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance
Soon after IEEE 1076-1987 was released, simulator companies began using
their own non-standard types => VHDL was becoming nonstandard
The IEEE 1164 standard was developed by an IEEE committee
IEEE 1164 contains definitions for a nine-valued data type, std_logic
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware
Defines, e.g., two numeric types: signed and unsigned

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes an entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its
environment
Each port must have a name, a direction, and a type
An entity may have NO port declaration

Port directions

In: The value of a port can be read inside the component but cannot be assigned.
Multiple reads of the port are allowed.
Out: Assignments can be made to a port, but data from the port cannot be read.
Multiple assignments are allowed.
Inout: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment
(buffer ports are not recommended).
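
A common way to work within the port rules above is sketched below: the design keeps an internal signal that can be both read and assigned, and drives the out port from it, so no buffer port is needed. The entity name counter4 and its ports are hypothetical. Note that the rule that an out port cannot be read applies to VHDL-87 and VHDL-93; VHDL-2008 relaxes it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter illustrating in and out port directions.
entity counter4 is
  port (
    clk   : in  std_logic;                     -- read only inside the design
    rst   : in  std_logic;                     -- synchronous reset, active high
    count : out std_logic_vector(3 downto 0)   -- assigned, never read
  );
end entity counter4;

architecture rtl of counter4 is
  -- Internal copy of the count; unlike the out port it may be read freely.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;  -- reads count_i, not the out port
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_i);
end architecture rtl;
```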

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;
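
The Comparator entity and its Comparator1 architecture can be exercised with a small self-checking testbench. The sketch below assumes both design units have been compiled into library work; the testbench name Comparator_tb and the stimulus values are hypothetical. The testbench entity is also an example of an entity with no port declaration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- A testbench has no ports: it only generates stimulus and checks results.
entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal A, B : std_logic_vector(7 downto 0) := (others => '0');
  signal EQ   : std_logic;
begin
  -- Direct instantiation of entity Comparator with architecture Comparator1.
  dut : entity work.Comparator(Comparator1)
    port map (A => A, B => B, EQ => EQ);

  stimulus : process
  begin
    A <= x"3C"; B <= x"3C";
    wait for 10 ns;
    assert EQ = '1' report "equal inputs should give EQ = '1'" severity error;

    A <= x"3C"; B <= x"3D";
    wait for 10 ns;
    assert EQ = '0' report "different inputs should give EQ = '0'" severity error;

    wait;  -- suspend forever; the simulation has nothing more to do
  end process stimulus;
end architecture sim;
```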

Configurations

Link the entity declaration and architecture body together
The concept of a default configuration is a bit messy in VHDL '87
– The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL
source
Complex configuration declarations are ignored in synthesis
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture
Are always optional
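
As an illustration of the points above, the simplest form of a configuration declaration only names which architecture to bind to an entity. The sketch assumes the Comparator entity and Comparator1 architecture from the earlier example are already in library work; the configuration name Comparator_cfg is hypothetical.

```vhdl
-- Selects architecture Comparator1 for entity Comparator.
-- Assumes both design units have already been analyzed into library work.
configuration Comparator_cfg of Comparator is
  for Comparator1
    -- No component instances to bind in this architecture.
  end for;
end configuration Comparator_cfg;
```

If the entity had a second architecture, a second configuration naming it could be simulated instead, without re-analyzing either architecture.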

Packages

Packages contain information common to many design units

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not always needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data that can be shared globally among
several design units
A package consists of a declaration part and an optional body part

Package declaration can contain
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

Package body consists of
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
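
A package declaration with a matching package body might look like the sketch below. The package name vga_types, the constants, the subtype, and the function are all hypothetical and exist only to show where each kind of declaration goes; the 640 x 480 values come from the VGA resolution discussed earlier in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package vga_types is
  constant H_VISIBLE : natural := 640;  -- visible pixels per line
  constant V_VISIBLE : natural := 480;  -- visible lines per frame

  subtype pixel_t is std_logic_vector(0 downto 0);  -- one bit per pixel

  -- Only the function declaration appears here; the body is below.
  function is_visible(x, y : natural) return boolean;
end package vga_types;

-- Package body: needed here because the package declares a function.
package body vga_types is
  function is_visible(x, y : natural) return boolean is
  begin
    return (x < H_VISIBLE) and (y < V_VISIBLE);
  end function is_visible;
end package body vga_types;
```

A design unit would make these names visible with "use work.vga_types.all;", assuming the package was compiled into library work.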

Libraries

Collection of VHDL design units (database)

1. Packages
– package declaration
– package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in the UNIX file system
Can also be any other kind of database

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware
It is possible to describe a circuit in a number of ways

Structural  -------
Dataflow    -------  Higher level of abstraction
Behavioral  -------

Structural VHDL description

Circuit is described in terms of its components
From a low-level description (e.g. transistor-level description) to a high-level
description (e.g. block diagram)
For large circuits, a low-level description quickly becomes impractical
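
A structural description can be sketched with a half adder built from two gate components. All entity names here (and_gate, xor_gate, half_adder) are hypothetical, and the sketch assumes the three design units are compiled into library work in the order shown.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity and_gate;

architecture dataflow of and_gate is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity xor_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity xor_gate;

architecture dataflow of xor_gate is
begin
  y <= a xor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end entity half_adder;

-- Structural architecture: contains only component instances and wiring.
architecture structural of half_adder is
begin
  u_sum   : entity work.xor_gate port map (a => a, b => b, y => sum);
  u_carry : entity work.and_gate port map (a => a, b => b, y => carry);
end architecture structural;
```

The half_adder architecture says nothing about how the gates compute their outputs; it only states which components exist and how they are connected.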

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in
the system
The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
– Either a lower-level description
– or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no third-party models for registers => you must write the behavioral
description of registers
The behavioral description can be provided in the form of
subprograms (functions or procedures)
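
Because the language has no built-in register, a register has to be described explicitly. A minimal behavioral sketch of an 8-bit register is given below, written as a clocked process rather than as a subprogram; the entity name dreg8, the port names, and the asynchronous active-high reset are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous, active-high reset.
entity dreg8 is
  port (
    clk : in  std_logic;
    rst : in  std_logic;
    d   : in  std_logic_vector(7 downto 0);
    q   : out std_logic_vector(7 downto 0)
  );
end entity dreg8;

architecture behavioral of dreg8 is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');      -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on each rising edge
    end if;
  end process;
end architecture behavioral;
```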

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time
Representation might include, e.g., state diagrams, timing diagrams, and
algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns)
If no actual delay is used, the order of sequential operations is defined
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool
There are a few tools for behavioral synthesis
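
The delayed assignment mentioned above can be tried out in simulation with the sketch below. It is a simulation-only example: the entity name delay_demo and the stimulus times are hypothetical, and a synthesis tool would ignore the after clause.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal B : std_logic := '0';
  signal A : std_logic;
begin
  -- A follows B, 10 ns later.
  A <= B after 10 ns;

  stimulus : process
  begin
    wait for 20 ns;
    B <= '1';                      -- B rises at 20 ns
    wait for 5 ns;                 -- now at 25 ns
    assert A = '0' report "A changed too early" severity error;
    wait for 10 ns;                -- now at 35 ns, A rose at 30 ns
    assert A = '1' report "A did not follow B" severity error;
    wait;
  end process stimulus;
end architecture behavioral;
```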

Concurrent vs Sequential

Processes

Basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement
Statements inside a process statement are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals that wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
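
The two alternatives in the general format, a sensitivity list or a wait statement, can be compared side by side. In the sketch below both processes describe the same 2-to-1 multiplexer; the entity name mux2 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity mux2 is
  port (a, b, sel : in std_logic; y1, y2 : out std_logic);
end entity mux2;

architecture behavioral of mux2 is
begin
  -- Form 1: sensitivity list. The process wakes up on any event on a, b or sel.
  with_list : process (a, b, sel)
  begin
    if sel = '1' then
      y1 <= b;
    else
      y1 <= a;
    end if;
  end process with_list;

  -- Form 2: no sensitivity list; an explicit wait at the end does the same job.
  with_wait : process
  begin
    if sel = '1' then
      y2 <= b;
    else
      y2 <= a;
    end if;
    wait on a, b, sel;
  end process with_wait;
end architecture behavioral;
```

The statements inside each process run sequentially, while the two processes themselves run concurrently, so y1 and y2 always carry the same value.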

                                                        11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
-- Note: STD_LOGIC_ARITH and numeric_std both declare types named signed and
-- unsigned; with both use clauses present, those two type names are not
-- directly visible.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;   -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;
      ---horizontal synch pulse(row)
      vsync      : out   std_logic;
      --vertical sync pulse(frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);

begin

--addr_s <= std_logic_vector(to_unsigned(addr,18));

process(clk50m, rstn)
begin
    if (rstn = '0') then
        clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
        clk <= not clk;
    end if;
    --end if;
end process;

hsync <= hsync_s;
vsync <= vsync_s;

sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
);


pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
);

mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
);

end Behavioral;


                                                        SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic; ---horizontal sync pulse (row)
    vsync      : out std_logic; --vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is

signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;

begin

c_hs <= c_hs_s;
line_count <= line_count_s;
h_en <= h_en_s;
v_en <= v_en_s;

-- hsync counter
process(clk, rstn)
begin
    if (rstn = '0') then
        c_hs_s <= 0;
    else
        if (clk = '1' and clk'event) then
            if (c_hs_s = 793) then
                c_hs_s <= 0;
            else
                c_hs_s <= c_hs_s + 1;
            end if;
        end if;
    end if;
end process;

--vsync counter
--vertical line counter
process(h_en_s, rstn)
begin
    if (rstn = '0') then
        line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
        line_count_s <= line_count_s + 1;
        if (line_count_s = 515) then
            line_count_s <= 0;
        end if;
    end if;
    --end if;
end process;

--hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) ----Horizontal Enable (row information will be sent in this period)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

---vsync pulse
process(clk, rstn) -- Vertical Enable (data is valid in this region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) --Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;
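
The counter limits in the synchronization block fix the scan rates. As a rough check, assume that clk50m is a 50 MHz oscillator (the name suggests this, but the listing does not state it) and that the top-level divider halves it to 25 MHz. The symbols below are introduced here only for illustration.

```latex
\begin{align*}
f_{\text{clk}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
N_h &= 793 + 1 = 794\ \text{clocks per line} \\
f_h &= \frac{f_{\text{clk}}}{N_h} = \frac{25\ \text{MHz}}{794} \approx 31.49\ \text{kHz} \\
N_v &= 515 + 1 = 516\ \text{lines per frame} \\
f_v &= \frac{f_h}{N_v} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{95 + 1}{f_{\text{clk}}} = \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s}
\end{align*}
```

Under these assumptions the rates are close to the nominal 640x480 VGA figures of about 31.47 kHz and 59.94 Hz (800 clocks per line and 525 lines per frame at 25.175 MHz), which is presumably why a plain divide-by-two clock is adequate here.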


                                                        PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

signal addr_s : std_logic_vector(17 downto 0);

begin

addr <= addr_s;

process(clk, rstn)
begin
    if (rstn = '0') then
        rd <= '0';
        red <= "0";
        green <= "0";
        blue <= "0";
        addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
        if ((v_en = '1') and (v_en_reg = '0')) then
            addr_s <= (others => '0');
        end if;
        if (v_en = '1' and h_en = '1') then
            if (line_count <= 514) then
                if (c_hs <= 655) then
                    rd <= '1';
                    red <= data_in;
                    green <= data_in;
                    blue <= data_in;
                    addr_s <= addr_s + "000000000000000001";
                else
                    rd <= '0';
                    red <= "0";
                    green <= "1";
                    blue <= "0";
                end if;
            else
                rd <= '0';
                red <= "1";
                green <= "1";
                blue <= "1";
            end if;
            if (addr_s = "111011111111111111") then
                addr_s <= (others => '0');
            end if;
        else
            red <= "0";
            green <= "0";
            blue <= "0";
        end if;
    end if;
    -- end if;
    --end if;
end process;

end behv;
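
The constants in the pixel generator imply a particular image size. The estimate below assumes that the one-cycle register delay on h_en shifts the enable window seen by this process from counts 143 to 778 to counts 144 to 779, and it ignores the pixel or so that may be lost to register latency at the first active line. W and H are names introduced here for illustration.

```latex
\begin{align*}
W &= 655 - 144 + 1 = 512\ \text{pixels read per line} \\
H &= 514 - 35 + 1 = 480\ \text{lines read per frame} \\
W \cdot H &= 512 \cdot 480 = 245760\ \text{one-bit pixels} \\
\texttt{111011111111111111}_2 &= 2^{18} - 2^{14} - 1 = 245759 = W \cdot H - 1
\end{align*}
```

On this reading the 18-bit address wraps after the last pixel of a nominal 512 x 480 one-bit frame, and because red, green and blue all carry the same memory bit the picture is black and white.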


                                                        12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
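
One common way for a logic block to realize an arbitrary combinational function is a lookup table (LUT): the inputs form an address into a small stored truth table. The VHDL below is a minimal behavioral sketch of a 3-input LUT, not a vendor primitive. The entity name lut3 and the generic INIT are hypothetical names chosen for illustration, and the sketch uses the numeric_std package rather than the arithmetic packages in the listings above.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input lookup table: INIT holds the 8-entry truth table.
entity lut3 is
    generic (
        -- INIT(i) is the output when the inputs (c, b, a) encode the value i.
        -- The default "10000000" sets only entry 7, i.e. a 3-input AND.
        INIT : std_logic_vector(7 downto 0) := "10000000"
    );
    port (
        a, b, c : in  std_logic;
        y       : out std_logic
    );
end lut3;

architecture rtl of lut3 is
    signal sel : std_logic_vector(2 downto 0);
begin
    -- Concatenate the inputs into a 3-bit table index (c is the MSB).
    sel <= c & b & a;

    -- Read the selected truth-table entry.
    y <= INIT(to_integer(unsigned(sel)));
end rtl;
```

With INIT set to "10010110" (entries 1, 2, 4 and 7 set) the same block behaves as a 3-input XOR, which illustrates how reconfiguring a logic block amounts to loading a different table.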

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing the final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
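
The CPLD structure described above, a sum-of-products array feeding a clocked register, can be sketched in VHDL as a single macrocell. This is an illustrative model only: the entity name sop_macrocell, its ports, and the two product terms are assumptions made for the example and are not taken from any specific device.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical CPLD-style macrocell: two product terms, one OR, one register.
entity sop_macrocell is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;   -- active-low asynchronous reset
        a, b, c, d : in  std_logic;
        q          : out std_logic
    );
end sop_macrocell;

architecture rtl of sop_macrocell is
    signal sop : std_logic;
begin
    -- AND plane forms the product terms; the OR plane sums them.
    sop <= (a and b) or ((not c) and d);

    -- The sum of products feeds a single clocked register.
    process (clk, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk) then
            q <= sop;
        end if;
    end process;
end rtl;
```

Because every input passes through the same fixed AND-OR path to the register, the delay from pin to register is nearly the same for any function programmed into the cell, which is the predictable timing the comparison refers to.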

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs support partial reconfiguration, which lets one portion of the
device be reprogrammed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
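The throughput argument above can be made concrete with a little arithmetic. The Python sketch below is only an illustration: the function name and every figure in it (clock rates, number of parallel units) are hypothetical assumptions, not measurements of any real device.

```python
def ops_per_second(clock_hz, parallel_units, ops_per_unit_per_cycle=1):
    """Peak operations per second when every unit completes a fixed
    number of operations on each clock cycle."""
    return clock_hz * parallel_units * ops_per_unit_per_cycle


# Hypothetical figures, chosen only to illustrate the arithmetic.
# A processor at 3 GHz doing one multiply-accumulate per cycle:
cpu_rate = ops_per_second(clock_hz=3_000_000_000, parallel_units=1)
# An FPGA at 100 MHz with 200 multiply-accumulate units working in parallel:
fpga_rate = ops_per_second(clock_hz=100_000_000, parallel_units=200)

print(f"processor: {cpu_rate:.2e} ops/s")
print(f"FPGA:      {fpga_rate:.2e} ops/s")
print(f"ratio:     {fpga_rate / cpu_rate:.1f}x")
```

Under these assumed numbers the FPGA's clock is thirty times slower, yet its peak rate is higher because the work is spread across many units. Narrower number formats would let more units fit in the same fabric, which is the precision-for-parallelism trade-off described above.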

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a 4-8 hour
wait is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.
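The cost trade-off above can be sketched as a break-even calculation. This is a simplified model, not taken from the source: it assumes total cost is a one-time development cost plus a per-unit cost, and all the monetary figures are hypothetical.

```python
def total_cost(development_cost, unit_cost, volume):
    """Total cost of producing `volume` parts: one-time development
    cost plus the per-unit hardware cost."""
    return development_cost + unit_cost * volume


def break_even_volume(asic_dev, asic_unit, fpga_dev, fpga_unit):
    """Volume at which the ASIC's lower unit cost has paid back its
    higher development cost (assumes fpga_unit > asic_unit)."""
    return (asic_dev - fpga_dev) / (fpga_unit - asic_unit)


# Hypothetical figures, for illustration only.
ASIC_DEV, ASIC_UNIT = 1_000_000, 5
FPGA_DEV, FPGA_UNIT = 50_000, 30

crossover = break_even_volume(ASIC_DEV, ASIC_UNIT, FPGA_DEV, FPGA_UNIT)
print("break-even volume:", crossover)
for volume in (10_000, 100_000):
    fpga = total_cost(FPGA_DEV, FPGA_UNIT, volume)
    asic = total_cost(ASIC_DEV, ASIC_UNIT, volume)
    print(volume, "units -> cheaper option:", "FPGA" if fpga < asic else "ASIC")
```

Below the crossover volume the FPGA is the cheaper choice despite its per-unit premium. Anything that raises ASIC development cost or lowers FPGA unit cost pushes the crossover higher, which is one way to read the "new cost and performance dynamics" mentioned above.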

                                                        Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
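Rent's rule is commonly written T = t * g^p, where T is the number of external terminals of a block of logic, g is the number of gates it contains, t is the average number of terminals per gate, and p is the Rent exponent. The Python sketch below evaluates that formula. The parameter values are hypothetical and chosen only to show how two designs with the same gate count can differ in interconnect demand.

```python
def rent_terminals(gates, t, p):
    """Rent's rule estimate T = t * g**p of the external terminals
    needed by a block containing `gates` gates."""
    return t * gates ** p


GATES = 256          # same amount of logic in both cases (hypothetical)
T_PER_GATE = 4.0     # assumed average terminals per gate

# Assumed exponents: a lower p stands for mostly local wiring,
# a higher p for wiring that reaches across the whole design.
local_wiring = rent_terminals(GATES, T_PER_GATE, p=0.5)
global_wiring = rent_terminals(GATES, T_PER_GATE, p=0.75)

print("terminals with p = 0.50:", round(local_wiring))
print("terminals with p = 0.75:", round(global_wiring))
```

The estimate grows quickly with the exponent even though the gate count is unchanged, which is why routing demand cannot be read off from the amount of logic alone.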

The FPGA considered here is an array, or island-style, FPGA. It consists of
an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                        Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
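The logic block described above can be modelled in a few lines of Python. This is a behavioural sketch only: the class and method names are hypothetical, and the 16-bit truth table encoding (bit i holds the output for input pattern i) is an assumption made for the illustration.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a flip-flop."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 bits")
        self.truth_table = truth_table
        self.registered = registered   # selects which output is driven
        self.flip_flop = 0             # state captured on the clock edge

    def lut(self, a, b, c, d):
        """Look up the output bit for the four input bits."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model a clock edge: the flip-flop captures the LUT output."""
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single output: registered or unregistered LUT value."""
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


# 0x6996 encodes a 4-input XOR; 0x8000 encodes a 4-input AND.
xor_block = LogicBlock(0x6996)
and_block = LogicBlock(0x8000, registered=True)

print(xor_block.output(1, 0, 1, 1))
and_block.clock(1, 1, 1, 1)
print(and_block.output(0, 0, 0, 0))
```

Changing the 16-bit constant changes the function without changing the structure, which is what makes the block programmable.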

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.
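The planar topology described above reduces to a simple rule: a wire can only reach the same track number on the other three sides. The Python sketch below enumerates those connections. The side names and the function name are hypothetical labels introduced for the illustration.

```python
SIDES = ("left", "right", "top", "bottom")


def planar_connections(side, track, channel_width):
    """Return the (side, track) pairs reachable from a wire entering a
    planar (domain-based) switch box on `side` at `track`.

    Tracks are numbered from 1, as in the text.
    """
    if side not in SIDES:
        raise ValueError("unknown side: " + side)
    if not 1 <= track <= channel_width:
        raise ValueError("track outside the channel")
    # One programmable switch per other side, always on the same track.
    return [(other, track) for other in SIDES if other != side]


for target in planar_connections("left", track=2, channel_width=4):
    print(target)
```

Each call returns exactly three targets, matching the three programmable switches per wire, and the track number never changes along a path. That restriction keeps the switch box small at the cost of some routing flexibility.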

                                                        Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

                                                        Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of hardware designs alongside early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC designs on FPGAs
has become a good method to do this.


                                                        Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed
and modeling accuracy limitations hinder that approach.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This exposes many unforeseen software bugs that appear due
to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                        14 APPLICATIONS

                                                        Motion Tracking Systems

                                                        Image processing systems

                                                        Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the VGA Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.




For each color, the 2-bit digital signal from the VGA Controller can be
converted into 4 distinct analog levels. These levels specify the intensity of
each of the three primary colors to use when displaying the pixel on the
monitor's screen. The levels range from 0 V (total darkness) to 0.7 V (maximum
brightness).

With each analog input being one of four possible levels, the monitor
can display each pixel on the screen with one of 64 different color
permutations.
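The numbers above follow from simple arithmetic, which the Python sketch below reproduces. The text gives only the end points (0 V and 0.7 V) and the count of levels, so the even spacing of the intermediate levels is an assumption, and the function name is hypothetical.

```python
BITS_PER_COLOR = 2   # from the text: a 2-bit signal per primary color
V_MAX = 0.7          # from the text: maximum brightness, in volts


def analog_level(code):
    """Map a 2-bit code (0..3) to a voltage between 0 V and V_MAX,
    assuming the four levels are evenly spaced."""
    max_code = (1 << BITS_PER_COLOR) - 1
    if not 0 <= code <= max_code:
        raise ValueError("code does not fit in 2 bits")
    return V_MAX * (code / max_code)


levels_per_color = 1 << BITS_PER_COLOR      # 2 bits give 4 levels
total_colors = levels_per_color ** 3        # three primaries: R, G, B

for code in range(levels_per_color):
    print(f"code {code:02b} -> {analog_level(code):.3f} V")
print("distinct colors:", total_colors)
```

Four levels on each of the three primaries give 4 x 4 x 4 = 64 combinations, which is the figure quoted above.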

Using Block RAM in Spartan-3E Generation FPGAs

For applications requiring large on-chip memories, Spartan-3
Generation FPGAs provide plentiful, efficient SelectRAM memory blocks.
Using various configuration options, SelectRAM blocks create RAM, ROM,
FIFOs, large look-up tables, data width converters, circular buffers, and
shift registers, each supporting various data widths and depths. This
application note describes the features and capabilities of block SelectRAM
and illustrates how to specify the various options using the Xilinx CORE
Generator system or via VHDL or Verilog instantiation.


Table 2: Block RAM Available in Spartan-3E Devices

Device     RAM Columns  RAM Blocks per Column  Total RAM Blocks  Total RAM Bits  Total RAM Kbits
XC3S100E   1            4                      4                 73,728          72

                                                          XC3

                                                          S250

                                                          E

                                                          2 6 1

                                                          2

                                                          2

                                                          2

                                                          1

                                                          1

                                                          8

                                                          4

                                                          2

                                                          1

                                                          6

                                                          XC3

                                                          S500

                                                          E

                                                          2 10 2

                                                          0

                                                          3

                                                          6

                                                          8

                                                          6

                                                          4

                                                          3

                                                          6

                                                          0

                                                          30

                                                          0

                                                          XC3

                                                          S120

                                                          0E

                                                          2 14 2

                                                          8

                                                          5

                                                          1

                                                          6

                                                          0

                                                          9

                                                          6

                                                          5

                                                          0

                                                          4

                                                          XC3

                                                          S160

                                                          0E

                                                          2 18 3

                                                          6

                                                          6

                                                          6

                                                          3

                                                          5

                                                          5

                                                          2

                                                          6

                                                          4

                                                          8

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which is allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional "plus" data bits. Physically, the block RAM memory has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: both ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM simulates single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations or aspect ratios. Table 3 summarizes the essential SelectRAM features. Cascade multiple block RAMs to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
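As an illustration of the synchronous, single-port use of block RAM described above, the following is a minimal sketch of a 1K x 16 memory (16K data bits, the data portion of one block). The entity name single_port_ram and all port names are assumptions made for this example; whether a synthesis tool actually maps this description onto a block RAM depends on the tool and its settings.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical single-port synchronous RAM: 1024 words x 16 bits = 16K bits.
entity single_port_ram is
  port (
    clk  : in  std_logic;                      -- port clock
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(9 downto 0);   -- 2**10 = 1024 locations
    din  : in  std_logic_vector(15 downto 0);
    dout : out std_logic_vector(15 downto 0)
  );
end entity single_port_ram;

architecture rtl of single_port_ram is
  type ram_t is array (0 to 1023) of std_logic_vector(15 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: needs a clock edge and the clock enable.
        -- Because ram is a signal, a simultaneous write is not yet visible,
        -- so the previously stored word is returned.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Both the write and the read happen only on a rising clock edge with en = '1', which mirrors the clock and clock-enable requirement stated in the text.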


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

- Interoperability
- Technology independence
- Design reuse
- Several levels of abstraction
- Readability
- Standard language
- Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

- Design specification language
- Design entry language
- Design simulation language
- Design documentation language
- An alternative to schematics

10.1 Brief History

- VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.
- It was funded by the US Department of Defense.
- The first publicly available version was released in 1985.
- In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.
- In 1987 standardization resulted in IEEE 1076-1987.
- An improved version of the language was released in 1994 as IEEE standard 1076-1993.

10.2 Related Standards

- IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.
- Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types, so VHDL was becoming non-standard.
- The IEEE 1164 standard was developed by the IEEE. It contains definitions for a nine-valued data type, std_logic.
- IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.
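A small sketch of how these two standards appear in practice: the high-impedance value 'Z' comes from the IEEE 1164 std_logic type, and arithmetic is done on the unsigned type from the numeric standard. The entity name std_types_demo and its ports are assumptions chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

-- Hypothetical demo entity, not part of the VGA design.
entity std_types_demo is
  port (
    oe    : in  std_logic;                     -- output enable
    a, b  : in  std_logic_vector(7 downto 0);
    sum   : out std_logic_vector(7 downto 0);
    bus_q : out std_logic
  );
end entity std_types_demo;

architecture rtl of std_types_demo is
begin
  -- std_logic has nine values: 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-'.
  -- 'Z' models a released (high-impedance) tri-state output.
  bus_q <= a(0) when oe = '1' else 'Z';

  -- Addition is defined for unsigned, so the vectors are converted
  -- explicitly; the 8-bit result wraps around on overflow.
  sum <= std_logic_vector(unsigned(a) + unsigned(b));
end architecture rtl;
```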

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

- A black box with an interface definition
- Defines the inputs/outputs of a component (defines pins)
- A way to represent modularity in VHDL
- Similar to a symbol in a schematic
- The entity declaration describes the entity

E.g.:

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

- Provide channels of communication between the component and its environment
- Each port must have a name, a direction and a type
- An entity may have NO port declaration

Port directions

- In: the value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
- Out: assignments can be made to the port, but (in VHDL-87/93) data from the port cannot be read. Multiple assignments are allowed.
- Inout: bidirectional; assignments can be made and data can be read. Multiple assignments are allowed.
- Buffer: an out port with read capability. May have at most one assignment (buffer ports are not recommended).
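Because an out port cannot be read back and buffer ports are discouraged, a common workaround is to keep the value in an internal signal and copy it to the out port. The sketch below shows this for a 4-bit counter; the entity name counter4, the signal count_i and the active-high synchronous reset are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port modes.
entity counter4 is
  port (
    clk   : in  std_logic;                     -- in: read only
    rst   : in  std_logic;                     -- in: read only
    count : out std_logic_vector(3 downto 0)   -- out: assigned only
  );
end entity counter4;

architecture rtl of counter4 is
  -- Internal copy that may be both read and assigned.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;   -- reads the internal signal, not the port
      end if;
    end if;
  end process;

  -- The out port is only ever assigned, never read.
  count <= std_logic_vector(count_i);
end architecture rtl;
```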

Architectures

- Every entity has at least one architecture
- One entity can have several architectures
- Architectures can describe a design using
  - Behavior
  - Structure
  - Dataflow
- Architectures can describe a design on many levels
  - Gate level
  - RTL (Register Transfer Level)
  - Behavioral level
- A configuration declaration links an architecture to an entity

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

- Link the entity declaration and the architecture body together
- The concept of a default configuration is a bit messy in VHDL '87
  - The last architecture analyzed links to the entity
- Can be used to change simulation behavior without re-analyzing the VHDL source
- Complex configuration declarations are ignored in synthesis
- Some entities can have, e.g., a gate-level architecture and a behavioral architecture
- Are always optional
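The following sketch shows one entity with two architectures and a configuration declaration that picks one of them. It reuses the Comparator entity from the entity example; the second architecture Comparator2 and the configuration name Comparator_cfg are assumptions introduced for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: dataflow style.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Second architecture (hypothetical): behavioral style, same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: binds architecture Comparator2 to the entity.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Selecting Comparator1 instead only requires a different configuration declaration; the entity and both architectures stay untouched.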

Packages

Packages contain information common to many design units.

1. Package declaration
   - constant declarations
   - type and subtype declarations
   - function and procedure declarations
   - global signal declarations
   - file declarations
   - component declarations
2. Package body
   - is not necessarily needed
   - function bodies
   - procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units. A package consists of a declaration part and an optional body part.

A package declaration can contain

- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of

- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
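A minimal sketch of a package declaration with its body. The package name util_pkg and everything declared in it (DATA_WIDTH, byte_t, parity) are assumptions made up for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package util_pkg is
  constant DATA_WIDTH : integer := 8;
  subtype byte_t is std_logic_vector(DATA_WIDTH - 1 downto 0);
  -- Only the function interface is declared here.
  function parity (v : std_logic_vector) return std_logic;
end package util_pkg;

-- Package body: needed here because the function requires a body.
package body util_pkg is
  function parity (v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    -- XOR all bits together: '1' for an odd number of '1' bits.
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end function parity;
end package body util_pkg;
```

A design unit compiled into the same working library can then make these declarations visible with a clause such as use work.util_pkg.all;.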

Libraries

A library is a collection of VHDL design units (a database):

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware. It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

- Structural
- Dataflow
- Behavioral

Structural VHDL Description

- The circuit is described in terms of its components.
- It can range from a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
- For large circuits, a low-level description quickly becomes impractical.
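As a sketch of the structural style, the full adder below is described purely as an interconnection of two half-adder components and one OR operation. The entity names half_adder and full_adder and all signal names are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Leaf component (hypothetical): a half adder.
entity half_adder is
  port (a, b : in  std_logic;
        s, c : out std_logic);
end entity half_adder;

architecture dataflow of half_adder is
begin
  s <= a xor b;
  c <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity full_adder is
  port (a, b, cin : in  std_logic;
        sum, cout : out std_logic);
end entity full_adder;

-- Structural architecture: only component instances and wiring.
architecture structural of full_adder is
  component half_adder
    port (a, b : in  std_logic;
          s, c : out std_logic);
  end component;

  signal s1, c1, c2 : std_logic;  -- internal wires between the components
begin
  ha1 : half_adder port map (a => a,  b => b,   s => s1,  c => c1);
  ha2 : half_adder port map (a => s1, b => cin, s => sum, c => c2);
  cout <= c1 or c2;
end architecture structural;
```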

10.3 Dataflow VHDL Description

- The circuit is described in terms of how data moves through the system.
- In the dataflow style you describe how information flows between registers in the system.
- The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
- The behavior of the system over time is defined by registers.
- There are no built-in registers in the VHDL language:
  - either a lower-level description
  - or a behavioral description of sequential elements is needed.
- The lower-level register descriptions must be created or obtained.
- If there are no third-party models for registers, you must write the behavioral description of the registers yourself.
- The behavioral description can be provided in the form of subprograms (functions or procedures).
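Since VHDL has no built-in register, a register is usually written as a small behavioral description. The sketch below uses a clocked process rather than a subprogram; the entity name reg8, the asynchronous active-high reset and the enable input are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with asynchronous reset and clock enable.
entity reg8 is
  port (clk, rst, en : in  std_logic;
        d            : in  std_logic_vector(7 downto 0);
        q            : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rst)
  begin
    if rst = '1' then
      q <= (others => '0');        -- reset takes priority over the clock
    elsif rising_edge(clk) then
      if en = '1' then
        q <= d;                    -- capture the input on the clock edge
      end if;                      -- otherwise q keeps its stored value
    end if;
  end process;
end architecture behavioral;
```

A dataflow description can then instantiate such registers and describe only the combinational logic between them.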

10.4 Behavioral VHDL Description

- The circuit is described in terms of its operation over time.
- The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.
- The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
- If no actual delay is used, the order of sequential operations is defined.
- In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
- The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
- There are a few tools for behavioral synthesis.
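A short simulation-only sketch of explicit timing in a behavioral description. The entity name delay_demo and the stimulus values are assumptions; the after clause and the wait for statements are meaningful to a simulator and are not intended for synthesis.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical simulation-only entity: it has no ports.
entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Precise timing: a follows b with a delay of 10 ns.
  a <= b after 10 ns;

  -- Stimulus: the order of the statements defines the sequence of events.
  stimulus : process
  begin
    b <= '1';
    wait for 25 ns;
    b <= '0';
    wait for 25 ns;
    wait;            -- suspend the process for the rest of the simulation
  end process stimulus;
end architecture behavioral;
```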

Concurrent vs. Sequential

Processes

- The basic simulation concept in VHDL.
- A VHDL description can always be broken up into interconnected processes.
- Quite similar to a Unix process.
- Declared with the process keyword in VHDL.
- A process statement is a concurrent statement.
- Statements inside a process statement are sequential statements.
- A process must contain either a sensitivity list or wait statement(s), but NOT both.
- The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
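The two forms can be compared side by side. In the sketch below both processes describe the same AND function, one with a sensitivity list and one with a wait statement; the entity name and_gate_demo and the port names are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo entity with two equivalent outputs.
entity and_gate_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate_demo;

architecture behavioral of and_gate_demo is
begin
  -- Form 1: sensitivity list. The process wakes up on any event on a or b.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: wait statement, and therefore no sensitivity list.
  -- Placing "wait on a, b" last gives the same wake-up behavior as form 1.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```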


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.numeric_std.ALL;
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not used together with
-- numeric_std here, because their type names clash with it.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;                      -- board clock input
    rstn   : in  std_logic;                      -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  -- Synchronization block
  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;   -- horizontal sync pulse (row)
      vsync      : out   std_logic;   -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

  -- Pixel generator
  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide clk50m by two to generate the clock used by the sub-blocks.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn   -- driven by the active-low rstn; check the polarity the memory core expects
  );

end Behavioral;

                                                          SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  -- horizontal sync pulse (row)
  vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then

      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  --vsync counter
  --vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if
  end process;

  --hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then

      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if
  end process;

  process(clk, rstn)  -- Horizontal Enable (row information will be sent in this period)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if
  end process;

  --vsync pulse
  process(clk, rstn)  -- Vertical Enable (data is valid in this region)
  begin
    if (rstn = '0') then

      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if
  end process;

  process(clk, rstn)  -- Vertical Sync, Front Porch, Back Porch
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if
  end process;

  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
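
The literal limits used in the synchronization block can be gathered in one place. The package below is a minimal sketch and not part of the original design: the package name vga_timing_pkg and every constant name are hypothetical, and only the numeric values are taken from the listing above. Assuming the pixel clock clk runs at 25 MHz (clk50m divided by two, if clk50m is a 50 MHz source), 794 clocks per line and 516 lines per frame correspond to a line rate of roughly 31.49 kHz and a frame rate of roughly 61.0 Hz.

```vhdl
-- Hypothetical package: names for the literals used in sync_unit.
-- Only the numeric values come from the listing; the names are illustrative.
package vga_timing_pkg is
  -- Horizontal timing, in pixel clocks
  constant H_TOTAL     : integer := 794;  -- c_hs_s counts 0 .. 793
  constant H_SYNC_LAST : integer := 95;   -- hsync is low while c_hs_s <= 95
  constant H_EN_FIRST  : integer := 143;  -- h_en_s is high from this count ...
  constant H_EN_LAST   : integer := 778;  -- ... up to and including this count

  -- Vertical timing, in lines
  constant V_TOTAL      : integer := 516; -- line_count_s counts 0 .. 515
  constant V_SYNC_LINES : integer := 2;   -- vsync is low while line_count_s < 2
  constant V_EN_FIRST   : integer := 35;  -- v_en_s is high from this line ...
  constant V_EN_LAST    : integer := 515; -- ... up to and including this line

  -- Derived widths of the enable windows
  constant H_EN_WIDTH : integer := H_EN_LAST - H_EN_FIRST + 1;  -- 636 clocks
  constant V_EN_LINES : integer := V_EN_LAST - V_EN_FIRST + 1;  -- 481 lines
end package vga_timing_pkg;
```

With such a package, a comparison like c_hs_s = 793 could be written as c_hs_s = H_TOTAL - 1, which keeps the counter limits and the enable windows consistent if the timing is ever changed.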

                                                          PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process(clk, rstn)
  begin

    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";

        end if;
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if
    --end if
  end process;

end behv;

                                                          12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions or merely simple
logic gates like AND and XOR. In most FPGAs, the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.
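
The combinational part of a logic block is commonly built around lookup tables (LUTs). As a rough illustration of the idea, the sketch below models a 3-input LUT in VHDL. The entity name lut3 and the generic INIT are assumptions made for this example and do not correspond to any particular vendor primitive: INIT simply stores one output bit for each of the eight input combinations.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 3-input lookup table. Changing INIT changes the logic
-- function without changing the structure of the circuit.
entity lut3 is
  generic (
    -- Default contents implement a 3-input XOR (odd parity):
    -- bit i of INIT is the output for input value i = c*4 + b*2 + a.
    INIT : std_logic_vector(7 downto 0) := "10010110"
  );
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity lut3;

architecture rtl of lut3 is
  signal sel : std_logic_vector(2 downto 0);
begin
  -- Form the table index from the three inputs (c is the most significant bit).
  sel <= c & b & a;

  -- Read the stored output bit for the current input combination.
  y <= INIT(to_integer(unsigned(sel)));
end architecture rtl;
```

Setting INIT to "10000000" would turn the same structure into a 3-input AND gate, which is the sense in which a logic block can be "configured" rather than rewired.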

History: The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985: the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was inducted into the National Inventors Hall of
Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable
logic arrays, gates, and logic blocks are founded in patents awarded to
David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement 600,000
reprogrammable gates. Casselman was successful, and the system was awarded a
patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993, Actel
was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a complete
"system on a programmable chip". This work mirrors the architecture by Ron
Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a
reconfigurable CPU architecture on a single chip called the SB24. That work
was done in 1982. Examples of such hybrid technologies can be found in the
Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC
processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is
another such device, which uses an AVR processor in combination with Atmel's
programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See Soft
processors below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or reconfigurable
systems: CPUs that reconfigure themselves to suit the task at hand. The
Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable
soft processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid
approach by providing an array of processor cores and FPGA-like programmable
cores on the same chip.

Gates
  1987: 9,000 gates (Xilinx)
  1992: 600,000 (Naval Surface Warfare Department)
  Early 2000s: millions

Market size
  1985: First commercial FPGA technology invented by Xilinx
  1987: $14 million
  ~1993: >$385 million
  2005: $1.9 billion
  2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, to
which the company attributes the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.
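
To make the CPLD structure more concrete, the sketch below describes a single sum-of-products term feeding one clocked register, which is the basic pattern the paragraph above refers to. The entity name sop_macrocell, the port names, and the particular logic function are assumptions chosen for illustration; a real CPLD macrocell offers more product terms and configuration options than shown here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, simplified CPLD-style macrocell:
-- a sum of two product terms feeding one clocked register.
entity sop_macrocell is
  port (
    clk     : in  std_logic;
    a, b, c : in  std_logic;
    q       : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum : std_logic;
begin
  -- Sum of products: (a AND b) OR (NOT a AND c).
  sum <= (a and b) or ((not a) and c);

  -- The register captures the sum on every rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end architecture rtl;
```

Because the path from the inputs to the register always passes through the same AND-OR structure, its delay is easy to predict, which is the timing advantage described above.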

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
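
The trade-off between number format and parallelism described above can be illustrated with a small Python sketch. It quantizes a value to a signed fixed-point word and reports the rounding error for several word lengths. The function names `to_fixed` and `from_fixed` and the chosen bit widths are illustrative assumptions, not part of any FPGA tool flow.

```python
def to_fixed(value, total_bits, frac_bits):
    """Quantize a real value to a signed fixed-point integer.

    total_bits and frac_bits are illustrative parameters. Values outside
    the representable range saturate instead of wrapping around.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = round(value * scale)
    return max(lowest, min(highest, raw))


def from_fixed(raw, frac_bits):
    """Convert a stored fixed-point integer back to a real value."""
    return raw / (1 << frac_bits)


if __name__ == "__main__":
    x = 0.7071  # arbitrary sample value
    for total_bits, frac_bits in [(8, 6), (12, 10), (18, 16)]:
        raw = to_fixed(x, total_bits, frac_bits)
        err = abs(from_fixed(raw, frac_bits) - x)
        print(f"{total_bits:2d}-bit word: stored {raw:6d}, error {err:.6f}")
```

A narrower word gives a larger rounding error but needs less logic per arithmetic unit, which is what leaves room for more units working in parallel.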

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                          Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.)

Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
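
Rent's rule is commonly written as T = t * g^p, where T is the number of external connections of a region containing g logic blocks, t is the average number of terminals per block, and p is the Rent exponent. The sketch below evaluates that formula in Python. The function name `rent_terminals`, the design size, and the exponent values are illustrative assumptions, not figures from any vendor.

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Estimate external connections with Rent's rule: T = t * g**p.

    num_blocks (g), terminals_per_block (t) and rent_exponent (p) are the
    usual Rent parameters; the values used below are illustrative only.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    if not 0.0 <= rent_exponent <= 1.0:
        raise ValueError("rent_exponent is expected to lie between 0 and 1")
    return terminals_per_block * num_blocks ** rent_exponent


if __name__ == "__main__":
    blocks = 1024        # hypothetical design size in logic blocks
    pins_per_block = 5   # assumed: four LUT inputs plus one output
    for p in (0.5, 0.65, 0.8):
        t = rent_terminals(blocks, pins_per_block, p)
        print(f"p = {p:.2f}: about {t:.0f} external connections")
```

Two designs with the same block count but different exponents give very different estimates, which is the effect the crossbar versus systolic array comparison describes.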

The example FPGA described here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

                                                          Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
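
The logic block described above can be modelled in a few lines of Python. This is a behavioural sketch only: the class name `LogicBlock`, the 16-bit `lut_mask` encoding, and the assumption that the flip-flop starts at 0 are choices made for illustration, not details of any particular device.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a D flip-flop.

    lut_mask: 16-bit integer; bit i is the LUT output for input pattern i.
    registered: selects the flip-flop output (True) or the raw LUT output.
    """

    def __init__(self, lut_mask, registered=False):
        if not 0 <= lut_mask < (1 << 16):
            raise ValueError("lut_mask must fit in 16 bits")
        self.lut_mask = lut_mask
        self.registered = registered
        self.ff = 0  # flip-flop state, assumed to start at 0

    def lut(self, a, b, c, d):
        """Look up the configured truth table; a is the least significant input."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.lut_mask >> index) & 1

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the single block output, registered or unregistered."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000)                    # only pattern 1111 gives 1
    parity = LogicBlock(0x6996, registered=True)  # XOR of the four inputs
    print("AND of 1,1,1,1:", and4.output(1, 1, 1, 1))
    parity.clock(1, 0, 0, 0)
    print("registered parity after one clock:", parity.output(0, 0, 0, 0))
```

Changing `lut_mask` reconfigures the block to any function of four inputs, which is what makes the same hardware usable for arbitrary logic.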

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                          Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                          Switch box topology
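
The planar switch box rule can also be written as a short Python sketch. It assumes a switch box with wire ends on four sides and tracks numbered from 1 to W. The side names and the functions `switch_box_targets` and `switch_count` are hypothetical names introduced for illustration.

```python
SIDES = ("north", "east", "south", "west")


def switch_box_targets(side, track, channel_width):
    """Return the (side, track) wire ends reachable from one incoming wire.

    Tracks are numbered from 1 to channel_width, matching the text. In the
    planar topology a wire can only reach the same track number on the
    three other sides of the switch box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side!r}")
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and channel_width")
    return [(other, track) for other in SIDES if other != side]


def switch_count(channel_width):
    """Count programmable switches in one switch box of this model.

    Each track has four wire ends with three switches each, and every
    switch is shared by two wire ends, giving 4 * 3 / 2 = 6 per track.
    """
    return 6 * channel_width


if __name__ == "__main__":
    print(switch_box_targets("west", 2, channel_width=4))
    print("switches for W = 4:", switch_count(4))
```

Because a signal can never change track number inside a switch box, a route that starts on one track stays on it, which keeps the switch count low at the cost of some routing flexibility.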

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or as a schematic. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. Because designing in HDLs has been compared to working in assembly language, there are moves to reduce that complexity by raising the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
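
The idea of repeating simulation after synthesis can be sketched in Python as a toy equivalence check. A behavioural function stands in for the RTL description, a LUT bit mask stands in for the mapped netlist, and a test bench applies every input pattern to both. All names here (`majority`, `to_lut_mask`, `lut_eval`, `run_testbench`) are hypothetical, and real flows use an HDL simulator rather than Python.

```python
def majority(a, b, c):
    """Behavioural reference model: 1 when at least two inputs are 1."""
    return int(a + b + c >= 2)


def to_lut_mask(func, num_inputs):
    """Toy stand-in for synthesis: tabulate func into a LUT bit mask.

    Bit i of the mask is the output for input pattern i, where the first
    argument of func is the least significant bit of i.
    """
    mask = 0
    for index in range(1 << num_inputs):
        bits = [(index >> k) & 1 for k in range(num_inputs)]
        if func(*bits):
            mask |= 1 << index
    return mask


def lut_eval(mask, *bits):
    """Evaluate the mapped model for one input pattern."""
    index = sum(bit << k for k, bit in enumerate(bits))
    return (mask >> index) & 1


def run_testbench(func, mask, num_inputs):
    """Apply every input pattern to both models and list the mismatches."""
    mismatches = []
    for index in range(1 << num_inputs):
        bits = tuple((index >> k) & 1 for k in range(num_inputs))
        if func(*bits) != lut_eval(mask, *bits):
            mismatches.append(bits)
    return mismatches


if __name__ == "__main__":
    mask = to_lut_mask(majority, 3)
    print("mismatches after mapping:", run_testbench(majority, mask, 3))
    # Flip one table entry to imitate a mapping error and rerun the bench.
    print("mismatches with a fault:", run_testbench(majority, mask ^ 0b1, 3))
```

Exhaustive comparison is only practical for very small blocks; it is used here because it makes the purpose of the repeated simulation easy to see.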

                                                          Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream practice, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

                                                          Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Limitations in simulation speed and modeling accuracy hinder such development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This exposes many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together outside the simulation stage, in actual hardware.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows more enhancement or improvement of the chip features than would otherwise have been possible.

                                                          13 SIMULATION RESULTS

Fig VGA CONTROLLER (GATE LEVEL)

Fig SYNC UNIT (GATE LEVEL)

Fig TECHNOLOGY SCHEMATIC

                                                          14 APPLICATIONS

                                                          Motion Tracking Systems

                                                          Image processing systems

                                                          Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

Table 2 Block RAM Available in Spartan-3E Devices

Device      RAM Columns   RAM Blocks per Column   Total RAM Blocks   Total RAM Bits   Total RAM Kbits
XC3S100E    1             4                       4                  73,728           72
XC3S250E    2             6                       12                 221,184          216
XC3S500E    2             10                      20                 368,640          360
XC3S1200E   2             14                      28                 516,096          504
XC3S1600E   2             18                      36                 663,552          648

Each block RAM contains 18,432 bits of fast static RAM, 16K bits of which are allocated to data storage and, in some memory configurations, an additional 2K bits allocated to parity or additional data bits. Physically, the block RAM has two completely independent access ports, labeled Port A and Port B. The structure is fully symmetrical: the ports are interchangeable, and both support data read and write operations. Each memory port is synchronous, with its own clock, clock enable and write enable. Read operations are also synchronous and require a clock edge and clock enable. Though physically a dual-port memory, block RAM can simulate a single-port memory in an application, as shown in Figure 1. Furthermore, each block memory supports multiple configurations, or aspect ratios. Table 3 summarizes the essential SelectRAM features. Multiple block RAMs can be cascaded to create deeper and wider memory organizations, with a minimal timing penalty incurred through specialized routing resources.
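The synchronous behaviour described above (clock, clock enable, write enable, registered read) can be sketched as a behavioural single-port memory. This is only an illustration: the entity name single_port_ram, its port names and the 16K x 1 organization are assumptions chosen to match the 16K data bits of one block, and whether a synthesis tool maps this description onto block RAM depends on the tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 16K x 1 single-port memory (names are illustrative only).
entity single_port_ram is
  port (
    clk  : in  std_logic;                      -- port clock
    en   : in  std_logic;                      -- clock enable
    we   : in  std_logic;                      -- write enable
    addr : in  std_logic_vector(13 downto 0);  -- 2**14 = 16384 locations
    din  : in  std_logic_vector(0 downto 0);
    dout : out std_logic_vector(0 downto 0));
end entity single_port_ram;

architecture rtl of single_port_ram is
  type ram_t is array (0 to 2**14 - 1) of std_logic_vector(0 downto 0);
  signal ram : ram_t := (others => (others => '0'));
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        if we = '1' then
          ram(to_integer(unsigned(addr))) <= din;
        end if;
        -- Synchronous read: dout only changes on an enabled clock edge.
        -- On a write, the previously stored value is read out.
        dout <= ram(to_integer(unsigned(addr)));
      end if;
    end if;
  end process;
end architecture rtl;
```

Nothing happens without a clock edge and an active en, which mirrors the statement that read operations also require a clock edge and clock enable.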


10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                                            Interoperability

                                                            Technology independence

                                                            Design reuse

                                                            Several levels of abstraction

                                                            Readability

                                                            Standard language

                                                            Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                                            Design specification language

                                                            Design entry language

                                                            Design simulation language

                                                            Design documentation language

                                                            An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

It was funded by the US Department of Defense.


The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE Standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

It defines, e.g., two numeric types: signed and unsigned.
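A small simulation-only sketch may help show how these standards appear in code. The entity name types_demo and the variable names are hypothetical. The block assumes the standard ieee.std_logic_1164 and ieee.numeric_std packages and uses assertions to state what each expression is expected to evaluate to.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

entity types_demo is
end entity types_demo;

architecture sim of types_demo is
begin
  process
    variable pin : std_logic := 'Z';  -- high impedance, one of the nine values
    variable u   : unsigned(7 downto 0) := "11111111";
    variable s   : signed(7 downto 0)   := "11111111";
  begin
    -- The same bit pattern is interpreted differently by the two types.
    assert to_integer(u) = 255 report "unsigned value mismatch" severity error;
    assert to_integer(s) = -1  report "signed value mismatch"   severity error;
    -- Arithmetic wraps at the declared width: 255 + 1 gives 0 in 8 bits.
    u := u + 1;
    assert to_integer(u) = 0 report "wrap-around mismatch" severity error;
    -- 'Z' is a distinct value, different from both '0' and '1'.
    assert pin /= '0' and pin /= '1' report "expected 'Z'" severity error;
    wait;  -- stop the process
  end process;
end architecture sim;
```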

                                                            VHDL Environment

                                                            Design Units


                                                            Segments of VHDL code that can be compiled separately and stored in a

                                                            library

Entities

A black box with an interface definition
Defines the inputs/outputs of a component (defines pins)
A way to represent modularity in VHDL
Similar to a symbol in a schematic
An entity declaration describes an entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment
Each port must have a name, a direction and a type
An entity may have NO port declaration


Port directions

In: The value of the port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.
Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed.
Inout: Bi-directional. Assignments can be made and data can be read. Multiple assignments are allowed.
Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended).

Architectures

Every entity has at least one architecture
One entity can have several architectures
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together
The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity
Can be used to change simulation behavior without re-analyzing the VHDL source
Complex configuration declarations are ignored in synthesis


Some entities can have, e.g., a gate-level architecture and a behavioral architecture
Configurations are always optional
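The role of a configuration can be sketched with the Comparator entity used earlier. Everything beyond that entity and its Comparator1 architecture is an assumption made for illustration: the second architecture Comparator2, the test bench comparator_tb and the configuration name comparator_cfg are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow architecture
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- A second, process-based architecture of the same entity (hypothetical)
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

library ieee;
use ieee.std_logic_1164.all;

entity comparator_tb is
end entity comparator_tb;

architecture test of comparator_tb is
  component Comparator
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end component;
  signal a_s, b_s : std_logic_vector(7 downto 0) := (others => '0');
  signal eq_s     : std_logic;
begin
  dut : Comparator port map (A => a_s, B => b_s, EQ => eq_s);
end architecture test;

-- Selects which architecture the instance "dut" is bound to.
configuration comparator_cfg of comparator_tb is
  for test
    for dut : Comparator
      use entity work.Comparator(Comparator2);
    end for;
  end for;
end configuration comparator_cfg;
```

Changing Comparator2 to Comparator1 in the configuration switches the simulated architecture without touching the test bench source.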

Packages

Packages contain information common to many design units.

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.
A package consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations


Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
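As a minimal sketch of the two parts of a package, the following declares a constant, a subtype and a function, and puts the function body in the package body. The names vga_pkg, H_VISIBLE, pixel_t and invert are hypothetical and are not taken from the project code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_pkg is
  -- constant declaration (640 visible pixels per line is assumed here)
  constant H_VISIBLE : natural := 640;
  -- subtype declaration
  subtype pixel_t is std_logic_vector(0 downto 0);
  -- function declaration; the body lives in the package body
  function invert (p : pixel_t) return pixel_t;
end package vga_pkg;

package body vga_pkg is
  function invert (p : pixel_t) return pixel_t is
  begin
    return not p;
  end function invert;
end package body vga_pkg;
```

A design unit compiled into the same library would make these names visible with "use work.vga_pkg.all;".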

Libraries

Collection of VHDL design units (database)
1 Packages
- package declaration
- package body
2 Entities (entity declaration)
3 Architectures (architecture body)
4 Configurations (configuration declarations)
Usually a directory in a UNIX file system
Can also be any other kind of database

                                                            Levels of Abstraction

                                                            VHDL supports many possible styles of design description which differ

                                                            primarily in how closely they relate to the HW

                                                            It is possible to describe a circuit in a number of ways

                                                            Structural-------

                                                            Dataflow ------- Higher level of abstraction

                                                            Behavioral -------

Structural VHDL description

The circuit is described in terms of its components
From a low-level description (e.g. a transistor-level description) to a high-level description (e.g. a block diagram)
For large circuits, a low-level description quickly becomes impractical
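A structural description can be sketched with a half adder built from two gate components. The entities and2, xor2 and half_adder and all their port names are assumptions made for this illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture dataflow of and2 is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture dataflow of xor2 is
begin
  y <= a xor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (a, b : in std_logic; sum, carry : out std_logic);
end entity half_adder;

-- Structural style: only component instances and their interconnections.
architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_sum   : xor2 port map (a => a, b => b, y => sum);
  u_carry : and2 port map (a => a, b => b, y => carry);
end architecture structural;
```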

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system
In the dataflow style you describe how information flows between registers in the system
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely
The behavior of the system over time is defined by registers
There are no built-in registers in the VHDL language
- either a lower-level description
- or a behavioral description of sequential elements is needed
The lower-level register descriptions must be created or obtained
If there are no third-party models for registers, you must write the behavioral description of the registers
The behavioral description can be provided in the form of subprograms (functions or procedures)
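Since the language has no built-in register, a behavioral register description has to be written by hand. The sketch below uses a clocked process rather than a subprogram, which is one common way to do it. The entity name reg8, the 8-bit width and the active-low asynchronous reset are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with active-low asynchronous reset.
entity reg8 is
  port (clk  : in  std_logic;
        rstn : in  std_logic;
        d    : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');   -- reset clears the register
    elsif rising_edge(clk) then
      q <= d;                 -- capture d on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```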

10.4 Behavioral VHDL Description

                                                            Circuit is described in terms of its operation over time


The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns;)
If no actual delay is used, the order of sequential operations is defined
At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool
There are a few tools for behavioral synthesis
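The effect of an explicit delay can be checked in simulation with a short self-checking sketch. The entity name delay_demo and the chosen check times are assumptions. As noted above, a synthesis tool would ignore the after clause.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end entity delay_demo;

architecture behavioral of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- a follows b with a 10 ns delay (simulation only)
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';                 -- change b at time 0
    wait for 5 ns;            -- t = 5 ns: the change has not reached a yet
    assert a = '0' report "a changed too early" severity error;
    wait for 10 ns;           -- t = 15 ns: the 10 ns delay has elapsed
    assert a = '1' report "a did not follow b" severity error;
    wait;                     -- stop the process
  end process stimulus;
end architecture behavioral;
```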

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL
A VHDL description can always be broken up into interconnected processes
Quite similar to a Unix process


Process keyword in VHDL

A process statement is a concurrent statement
Statements inside a process statement are sequential statements
A process must contain either a sensitivity list or wait statement(s), but NOT both
The sensitivity list or wait statement(s) contain the signals which wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
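The two permitted forms, sensitivity list or wait statement, can be compared side by side. In this sketch the entity and_gate and its ports are hypothetical. Both processes are expected to behave identically in simulation, because a sensitivity list is equivalent to a "wait on" statement at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity and_gate;

architecture behavioral of and_gate is
begin
  -- Form 1: sensitivity list, no wait statement
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait statement at the end
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```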


                                                            11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;


  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;


  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr,18));

  -- Divide the 50 MHz input clock by two to obtain clk
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );


  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                                                            SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- Horizontal Enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vsync pulse
  -- Vertical Enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- Vertical Sync, Front Porch, Back Porch
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- one-clock delayed copy of v_en_s
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
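
The counter limits that appear as bare numbers in sync_unit can be gathered in one place. The following is only a minimal sketch and is not part of the original design: the package name sync_timing_pkg and every constant name are hypothetical, while the numeric limits are copied from the sync_unit listing. The derived totals are plain arithmetic on those limits.

```vhdl
-- Hypothetical package (not in the original design) naming the limits
-- that sync_unit uses as literal numbers.
package sync_timing_pkg is

  -- Horizontal limits, in pixel-clock counts of c_hs_s
  constant H_LAST         : integer := 793;  -- c_hs_s wraps to 0 after this count
  constant HSYNC_LAST     : integer := 95;   -- hsync is driven low for counts 0..95
  constant H_ACTIVE_FIRST : integer := 143;  -- first count with h_en_s = '1'
  constant H_ACTIVE_LAST  : integer := 778;  -- last count with h_en_s = '1'

  -- Vertical limits, in lines of line_count_s
  constant V_LAST         : integer := 515;  -- line_count_s wraps to 0 after this line
  constant VSYNC_LINES    : integer := 2;    -- vsync is low while line_count_s < 2
  constant V_ACTIVE_FIRST : integer := 35;   -- first line with v_en_s = '1'

  -- Derived quantities
  constant H_TOTAL     : integer := H_LAST + 1;                          -- 794 clocks per line
  constant HSYNC_WIDTH : integer := HSYNC_LAST + 1;                      -- 96 clocks
  constant H_ACTIVE    : integer := H_ACTIVE_LAST - H_ACTIVE_FIRST + 1;  -- 636 clocks
  constant V_TOTAL     : integer := V_LAST + 1;                          -- 516 lines per frame
  constant V_ACTIVE    : integer := V_LAST - V_ACTIVE_FIRST + 1;         -- 481 lines

end sync_timing_pkg;
```

With such a package compiled into the work library, a comparison like c_hs_s = 793 could be written as c_hs_s = H_LAST after adding "use work.sync_timing_pkg.all;". For comparison, the nominal 640x480 mode at 60 Hz is usually quoted as 800 pixel clocks per line and 525 lines per frame, so the limits in this listing are somewhat shorter than the nominal values.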


                                                            PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

  signal addr_s : std_logic_vector(17 downto 0);

begin

  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;


                                                            12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
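
To illustrate how one logic block can serve as different gates, the following minimal sketch models a 3-input lookup table in VHDL. The entity name lut3, its port names, and the INIT generic are hypothetical and are not taken from any vendor library. The 8-bit INIT value is simply the truth table: the three inputs select one of its bits.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input lookup table. INIT(i) is the output for the input
-- combination whose binary value (c, b, a) equals i.
entity lut3 is
  generic (
    INIT : std_logic_vector(7 downto 0) := "10010110"  -- 3-input XOR
  );
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end lut3;

architecture rtl of lut3 is
  signal sel : std_logic_vector(2 downto 0);
begin
  -- Concatenate the inputs into a 3-bit index, c being the most significant bit
  sel <= c & b & a;
  -- Read the selected truth-table bit
  y <= INIT(to_integer(unsigned(sel)));
end rtl;
```

Instantiating the same entity with INIT set to "10000000" would give a 3-input AND instead, since only index 7 (all inputs high) then reads a '1'. Changing the stored table rather than the wiring is the sense in which such a block is configurable.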

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable interconnects
between gates, the beginnings of a new technology and market. The XC2064
boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup
tables (LUTs). More than 20 years later, Freeman was entered into the
National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a
complete "system on a programmable chip". This work mirrors the architecture
by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which
combined a reconfigurable CPU architecture on a single chip called the SB24.
That work was done in 1982. Examples of such hybrid technologies can be
found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or
more PowerPC processors embedded within the FPGA's logic fabric. The Atmel
FPSLIC is another such device, which uses an AVR processor in combination
with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic. (See
"Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A
combination of volume, fabrication improvements, research and development,
and the I/O capabilities of new supercomputers have largely closed the
performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in
the field to fix bugs, and lower non-recurring engineering costs. Vendors
can also take a middle road by developing their hardware on ordinary FPGAs
but manufacturing their final version so that it can no longer be modified
after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD
has a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue
running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly
with the introduction of dedicated multipliers into FPGA architectures in
the late 1990s, applications which had traditionally been the sole reserve
of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make
use of the massive parallelism offered by their architecture. One such area
is code breaking, in particular brute-force attacks on cryptographic
algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


                                                            The inherent parallelism of the logic resources on an FPGA allows for
                                                            considerable computational throughput even at low MHz clock rates. The
                                                            flexibility of the FPGA allows for even higher performance by trading off
                                                            precision and range in the number format for an increased number of parallel
                                                            arithmetic units. This has driven a new type of processing called
                                                            reconfigurable computing, where time-intensive tasks are offloaded from
                                                            software to FPGAs.
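The trade-off between operand precision and the number of parallel arithmetic units
can be sketched in VHDL as follows. This is only an illustration: the entity name
parallel_mult, the generics N and WIDTH, and the packed-vector port layout are
assumptions made for this sketch and do not come from the design described in this
report. Halving WIDTH roughly frees enough logic to raise N, which is the trade the
paragraph above describes.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: N independent multipliers working in the same clock cycle.
entity parallel_mult is
  generic (
    N     : positive := 4;  -- number of parallel multipliers (assumed value)
    WIDTH : positive := 8   -- operand width in bits (assumed value)
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(N * WIDTH - 1 downto 0);     -- N operands packed side by side
    b   : in  std_logic_vector(N * WIDTH - 1 downto 0);
    p   : out std_logic_vector(N * 2 * WIDTH - 1 downto 0)  -- N full-width products
  );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
  -- One registered multiplier per lane; all lanes compute concurrently.
  lanes : for k in 0 to N - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        p((k + 1) * 2 * WIDTH - 1 downto k * 2 * WIDTH) <=
          std_logic_vector(
            unsigned(a((k + 1) * WIDTH - 1 downto k * WIDTH)) *
            unsigned(b((k + 1) * WIDTH - 1 downto k * WIDTH)));
      end if;
    end process;
  end generate lanes;
end architecture rtl;
```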

                                                            The adoption of FPGAs in high-performance computing is currently limited by
                                                            the complexity of FPGA design compared to conventional software and by the
                                                            extremely long turn-around times of current design tools, where a wait of 4-8
                                                            hours is necessary after even minor changes to the source code.

                                                            Traditionally, FPGAs have been reserved for specific vertical applications
                                                            where the volume of production is small. For these low-volume applications,
                                                            the premium that companies pay in hardware costs per unit for a
                                                            programmable chip is more affordable than the development resources spent
                                                            on creating an ASIC. Today, new cost and performance dynamics have
                                                            broadened the range of viable applications.

                                                            Architecture

                                                            The most common FPGA architecture consists of an array of configurable
                                                            logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
                                                            channels have the same width (number of wires). Multiple I/O pads may fit
                                                            into the height of one row or the width of one column in the array.

                                                            An application circuit must be mapped into an FPGA with adequate resources.
                                                            While the number of CLBs and I/Os required is easily determined from the
                                                            design, the number of routing tracks needed may vary considerably even
                                                            among designs with the same amount of logic. (For example, a crossbar switch
                                                            requires much more routing than a systolic array with the same gate count.)
                                                            Since unused routing tracks increase the cost (and decrease the performance)
                                                            of the part without providing any benefit, FPGA manufacturers try to provide
                                                            just enough tracks so that most designs that will fit in terms of LUTs and I/Os
                                                            can be routed. This is determined by estimates such as those derived from
                                                            Rent's rule or by experiments with existing designs.
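For reference, Rent's rule is usually stated as a power law relating the number of
external terminals of a block of logic to the number of gates it contains. The
symbols below follow the conventional statement of the rule and are not taken from
this report; the exponent range mentioned afterwards is a commonly quoted figure,
not a value for any particular FPGA.

```latex
T = t \, g^{p}
```

Here T is the number of external terminals (signal connections) of the block, g is
the number of gates or logic blocks inside it, t is the average number of terminals
per gate, and p is the Rent exponent, commonly quoted as lying roughly between 0.5
and 0.8 for logic circuits. A larger p means the design needs more routing for the
same amount of logic.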

                                                            The example FPGA considered here is an array, or island-style, FPGA. It
                                                            consists of an array of logic blocks and routing channels. Two I/O pads fit
                                                            into the height of one row or the width of one column, as shown below. All
                                                            the routing channels have the same width (number of wires).

                                                            12.1 FPGA structure

                                                            A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                                                            flip-flop, as shown below. In recent years, manufacturers have started moving
                                                            to 6-input LUTs in their high-performance parts, claiming increased
                                                            performance.

                                                            Typical logic block


                                                            There is only one output, which can be either the registered or the
                                                            unregistered LUT output. The logic block has four inputs for the LUT and a
                                                            clock input. Since clock signals (and often other high-fanout signals) are
                                                            normally routed via special-purpose dedicated routing networks in commercial
                                                            FPGAs, they and other signals are separately managed.
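The logic block just described (a 4-input LUT, a flip-flop, and a single output that
is either registered or unregistered) can be modelled behaviourally in VHDL. This is
a minimal sketch under stated assumptions: the entity name logic_block and the
generics LUT_INIT and REGISTERED are hypothetical, and the generics stand in for the
configuration bits that a real FPGA would load from its bitstream.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioural model of a classic logic block.
entity logic_block is
  generic (
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
    REGISTERED : boolean := true                             -- true: flip-flop output, false: LUT output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry truth table indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- D flip-flop capturing the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which
is the sense in which the block is "programmable".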

                                                            For this example architecture, the locations of the FPGA logic block pins are
                                                            shown below.

                                                            Logic Block Pin Locations

                                                            Each input is accessible from one side of the logic block, while the output pin
                                                            can connect to routing wires in both the channel to the right and the channel
                                                            below the logic block.

                                                            Each logic block output pin can connect to any of the wiring segments in the
                                                            channels adjacent to it.

                                                            Similarly, an I/O pad can connect to any one of the wiring segments in the
                                                            channel adjacent to it. For example, an I/O pad at the top of the chip can
                                                            connect to any of the W wires (where W is the channel width) in the horizontal
                                                            channel immediately below it.

                                                            Generally, the FPGA routing is unsegmented. That is, each wiring segment
                                                            spans only one logic block before it terminates in a switch box. By turning on
                                                            some of the programmable switches within a switch box, longer paths can be
                                                            constructed. For higher-speed interconnect, some FPGA architectures use
                                                            longer routing lines that span multiple logic blocks.

                                                            Whenever a vertical and a horizontal channel intersect, there is a switch box.
                                                            In this architecture, when a wire enters a switch box, there are three
                                                            programmable switches that allow it to connect to three other wires in adjacent
                                                            channel segments. The pattern, or topology, of switches used in this
                                                            architecture is the planar or domain-based switch box topology. In this switch
                                                            box topology, a wire in track number 1 connects only to wires in track
                                                            number 1 in adjacent channel segments, wires in track number 2 connect
                                                            only to other wires in track number 2, and so on. The figure below illustrates
                                                            the connections in a switch box.
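The planar topology can be illustrated with a simplified VHDL model. This is a
sketch under several assumptions: it shows only wires entering from one side (here
called west), it treats the switches as one-directional tri-state drivers although
real switches are bidirectional, and it exposes the switch settings as input ports
rather than configuration memory. All names (planar_switch_box, west_in, sw_north
and so on) are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, one-directional model of a planar (domain-based) switch box.
entity planar_switch_box is
  generic (W : positive := 4);  -- channel width (number of tracks), assumed value
  port (
    west_in   : in  std_logic_vector(W - 1 downto 0);  -- wires entering from the west
    sw_north  : in  std_logic_vector(W - 1 downto 0);  -- one switch setting per track
    sw_east   : in  std_logic_vector(W - 1 downto 0);
    sw_south  : in  std_logic_vector(W - 1 downto 0);
    north_out : out std_logic_vector(W - 1 downto 0);
    east_out  : out std_logic_vector(W - 1 downto 0);
    south_out : out std_logic_vector(W - 1 downto 0)
  );
end entity planar_switch_box;

architecture behavioral of planar_switch_box is
begin
  tracks : for t in 0 to W - 1 generate
    -- Three programmable switches per entering wire.
    -- Track t connects only to track t on the other sides.
    north_out(t) <= west_in(t) when sw_north(t) = '1' else 'Z';
    east_out(t)  <= west_in(t) when sw_east(t)  = '1' else 'Z';
    south_out(t) <= west_in(t) when sw_south(t) = '1' else 'Z';
  end generate tracks;
end architecture behavioral;
```

Because the index t is the same on both sides of every assignment, a signal can
never change track inside the switch box, which is the defining property of the
planar topology.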

                                                            Switch box topology

                                                            Modern FPGA families expand upon the above capabilities to include
                                                            higher-level functionality fixed into the silicon. Having these common
                                                            functions embedded into the silicon reduces the area required and gives those
                                                            functions increased speed compared to building them from primitives.
                                                            Examples of these include multipliers, generic DSP blocks, embedded
                                                            processors, high-speed I/O logic, and embedded memories.


                                                            FPGAs are also widely used for systems validation, including pre-silicon
                                                            validation, post-silicon validation, and firmware development. This allows
                                                            chip companies to validate their design before the chip is produced in the
                                                            factory, reducing the time to market.

                                                            12.4 FPGA Design and Programming

                                                            To define the behavior of the FPGA, the user provides a design written in a
                                                            hardware description language (HDL) or drawn as a schematic. The HDL form
                                                            might be easier to work with when handling large structures, because it's
                                                            possible to just specify them numerically rather than having to draw every
                                                            piece by hand. On the other hand, schematic entry can allow for easier
                                                            visualisation of a design.
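As an illustration of specifying a large structure numerically, the following VHDL
sketch describes a shift register whose length is set by a single generic. The
entity name shift_reg and the generic DEPTH are hypothetical and chosen only for
this example; in a schematic, each of the 64 flip-flops would have to be drawn and
wired individually.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a DEPTH-stage shift register described by one number.
entity shift_reg is
  generic (DEPTH : positive := 64);  -- number of flip-flop stages (assumed value)
  port (
    clk  : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity shift_reg;

architecture rtl of shift_reg is
  signal stages : std_logic_vector(DEPTH - 1 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Shift every stage up by one position and take in the new bit.
      stages <= stages(DEPTH - 2 downto 0) & din;
    end if;
  end process;

  -- The oldest bit leaves the register after DEPTH clock cycles.
  dout <= stages(DEPTH - 1);
end architecture rtl;
```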

                                                            Then, using an electronic design automation tool, a technology-mapped netlist
                                                            is generated. The netlist can then be fitted to the actual FPGA architecture
                                                            using a process called place-and-route, usually performed by the FPGA
                                                            company's proprietary place-and-route software. The user will validate the
                                                            map, place, and route results via timing analysis, simulation, and other
                                                            verification methodologies. Once the design and validation process is
                                                            complete, the binary file generated (also using the FPGA company's
                                                            proprietary software) is used to (re)configure the FPGA.

                                                            Going from schematic/HDL source files to actual configuration: the source
                                                            files are fed to a software suite from the FPGA/CPLD vendor that, through
                                                            different steps, will produce a file. This file is then transferred to the
                                                            FPGA/CPLD via a serial interface (JTAG) or to an external memory device
                                                            like an EEPROM.

                                                            The most common HDLs are VHDL and Verilog, although in an attempt to
                                                            reduce the complexity of designing in HDLs, which have been compared to the
                                                            equivalent of assembly languages, there are moves to raise the abstraction
                                                            level through the introduction of alternative languages.


                                                            To simplify the design of complex systems in FPGAs, there exist libraries of
                                                            predefined complex functions and circuits that have been tested and optimized
                                                            to speed up the design process. These predefined circuits are commonly called
                                                            IP cores, and are available from FPGA vendors and third-party IP suppliers
                                                            (rarely free, and typically released under proprietary licenses). Other
                                                            predefined circuits are available from developer communities such as
                                                            OpenCores (typically free, and released under the GPL, BSD, or a similar
                                                            license) and other sources.

                                                            In a typical design flow, an FPGA application developer will simulate the
                                                            design at multiple stages throughout the design process. Initially, the RTL
                                                            description in VHDL or Verilog is simulated by creating test benches to
                                                            simulate the system and observe results. Then, after the synthesis engine has
                                                            mapped the design to a netlist, the netlist is translated to a gate-level
                                                            description, where simulation is repeated to confirm the synthesis proceeded
                                                            without errors. Finally, the design is laid out in the FPGA, at which point
                                                            propagation delays can be added and the simulation run again with these
                                                            values back-annotated onto the netlist.
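The first of these stages, RTL simulation with a test bench, can be sketched as
follows. The design under test (a 4-bit counter named counter4) and the test bench
(counter4_tb) are hypothetical and exist only to show the pattern; they are not part
of the VGA controller described in this report. The test bench generates a clock,
applies a reset, waits five clock edges, and checks the result with an assertion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical design under test: a 4-bit counter with synchronous reset.
entity counter4 is
  port (clk, rst : in  std_logic;
        q        : out std_logic_vector(3 downto 0));
end entity counter4;

architecture rtl of counter4 is
  signal count : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count <= (others => '0');
      else
        count <= count + 1;
      end if;
    end if;
  end process;
  q <= std_logic_vector(count);
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Test bench: no ports, it only drives and observes the design under test.
entity counter4_tb is
end entity counter4_tb;

architecture sim of counter4_tb is
  signal clk  : std_logic := '0';
  signal rst  : std_logic := '1';
  signal q    : std_logic_vector(3 downto 0);
  signal done : boolean := false;
begin
  dut : entity work.counter4 port map (clk => clk, rst => rst, q => q);

  -- Free-running clock with a 10 ns period; stops toggling once the test is done.
  clk <= not clk after 5 ns when not done else '0';

  stimulus : process
  begin
    wait until rising_edge(clk);   -- reset is sampled on this edge
    wait until falling_edge(clk);
    rst <= '0';
    for n in 1 to 5 loop           -- let the counter run for five rising edges
      wait until rising_edge(clk);
    end loop;
    wait until falling_edge(clk);
    assert unsigned(q) = 5
      report "counter4: expected 5 after five clock edges" severity error;
    done <= true;
    wait;                          -- suspend forever; simulation ends
  end process;
end architecture sim;
```

The same test bench can in principle be reused against the gate-level netlist and
the back-annotated netlist, so that the three simulation stages check the same
behaviour.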

                                                            Basic Process Technology Types

                                                            SRAM - Based on static memory technology. In-system programmable and
                                                            re-programmable. Requires external boot devices. CMOS.

                                                            Antifuse - One-time programmable. CMOS.

                                                            EPROM - Erasable Programmable Read-Only Memory technology. Usually
                                                            one-time programmable in production because of plastic packaging.
                                                            Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                                            EEPROM - Electrically Erasable Programmable Read-Only Memory
                                                            technology. Can be erased, even in plastic packages. Some, but not all,
                                                            EEPROM devices can be in-system programmed. CMOS.

                                                            Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                                                            packages. Some, but not all, flash devices can be in-system programmed.
                                                            Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                                                            therefore less expensive to manufacture. CMOS.

                                                            Fuse - One-time programmable. Bipolar.

                                                            Major Manufacturers

                                                            Xilinx and Altera are the current FPGA market leaders and long-time industry
                                                            rivals. Together, they control over 80 percent of the market, with Xilinx alone
                                                            representing over 50 percent.

                                                            Xilinx also provides free Windows and Linux design software, while Altera
                                                            provides free Windows tools; the Solaris and Linux tools are only available via
                                                            a rental scheme.

                                                            Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                                                            (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
                                                            Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
                                                            focused CSSP, no general purpose FPGAs).

                                                            12.5 FPGA prototype

                                                            FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                                                            prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                                                            for hardware verification and early software development.

                                                            Verification of hardware designs together with early software and firmware
                                                            co-design has become mainstream. Prototyping SoC and ASIC designs on an
                                                            FPGA has become a good method to do this.


                                                            Reasons why Prototyping is important

                                                            1. Running an SoC design on an FPGA prototype is a reliable way to ensure
                                                            that it is functionally correct, compared with designers relying only on
                                                            software simulations to verify that their hardware design is sound. Simulation
                                                            speed and modeling accuracy limitations hinder this development.

                                                            2. Due to time constraints, many projects cannot wait until the silicon is back
                                                            from the foundry to start on software tests. FPGA prototyping allows much
                                                            more time for software development and testing at the software-hardware
                                                            integration stage. This exposes many unforeseen software bugs that appear
                                                            due to today's array of operating systems, applications, and hardware.

                                                            3. Prototyping also allows the developer to ensure that all IP technologies on
                                                            the system work well together, off the simulation stage and in actual form.

                                                            4. Prototyping has the added advantage of serving as a demo platform for SoC
                                                            clients, bringing in interest early. This speeds up the overall development
                                                            cycle and allows for more enhancement or improvement of the chip features
                                                            than would otherwise have been possible.


                                                            13 SIMULATION RESULTS

                                                            GATE LEVEL
                                                            Fig. VGA CONTROLLER

                                                            GATE LEVEL
                                                            Fig. SYNC UNIT

                                                            TECHNOLOGY SCHEMATIC

                                                            14 APPLICATIONS

                                                            Motion Tracking Systems

                                                            Image processing systems

                                                            Displaying Systems

                                                            14.1 CONCLUSION

                                                            Hence, implemented on an FPGA, the Controller takes a processor-generated
                                                            (pixelated) picture from memory space and provides digital RGB values for
                                                            each pixel, as well as horizontal and vertical synchronization signals, in order
                                                            to correctly display the picture on a connected monitor.




                                                              0

                                                              Device      RAM columns  RAM blocks per column  Total RAM blocks  Total RAM bits  Total RAM Kbits
                                                              XC3S1200E   2            14                     28                516096          504
                                                              XC3S1600E   2            18                     36                663552          648

                                                              Each block RAM contains 18,432 bits of fast static RAM, 16K bits of
                                                              which are allocated to data storage and, in some memory configurations, an
                                                              additional 2K bits allocated to parity or additional "plus" data bits.
                                                              Physically, the block RAM memory has two completely independent access
                                                              ports, labeled Port A and Port B. The structure is fully symmetrical: both
                                                              ports are interchangeable, and both support data read and write operations.
                                                              Each memory port is synchronous, with its own clock, clock enable, and write
                                                              enable. Read operations are also synchronous and require a clock edge and
                                                              clock enable. Though physically a dual-port memory, block RAM simulates
                                                              single-port memory in an application, as shown in Figure 1. Furthermore, each
                                                              block memory supports multiple configurations or aspect ratios. Table 3
                                                              summarizes the essential SelectRAM features. Multiple block RAMs can be
                                                              cascaded to create deeper and wider memory organizations, with a minimal
                                                              timing penalty incurred through specialized routing resources.
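The dual-port behaviour described above can be sketched as a behavioural VHDL model.
This is an illustration under stated assumptions, not the vendor's primitive: the
entity name dual_port_ram, the port names, and the 1024 x 16 organisation (16K data
bits, no parity) are chosen for this example. The model uses a VHDL-93 style shared
variable so that both ports can write; later VHDL revisions require a protected type
for shared variables, and whether a given synthesis tool maps this description onto
a block RAM should be checked against that tool's documentation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical behavioural model of a true dual-port synchronous RAM.
entity dual_port_ram is
  generic (
    ADDR_WIDTH : positive := 10;  -- 1024 locations (assumed configuration)
    DATA_WIDTH : positive := 16   -- 1024 x 16 = 16K data bits
  );
  port (
    -- Port A: own clock, clock enable, and write enable
    clk_a  : in  std_logic;
    en_a   : in  std_logic;
    we_a   : in  std_logic;
    addr_a : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din_a  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout_a : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    -- Port B: identical and fully independent
    clk_b  : in  std_logic;
    en_b   : in  std_logic;
    we_b   : in  std_logic;
    addr_b : in  std_logic_vector(ADDR_WIDTH - 1 downto 0);
    din_b  : in  std_logic_vector(DATA_WIDTH - 1 downto 0);
    dout_b : out std_logic_vector(DATA_WIDTH - 1 downto 0)
  );
end entity dual_port_ram;

architecture behavioral of dual_port_ram is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(DATA_WIDTH - 1 downto 0);
  shared variable ram : ram_t := (others => (others => '0'));
begin
  port_a : process (clk_a)
  begin
    if rising_edge(clk_a) then
      if en_a = '1' then
        if we_a = '1' then
          ram(to_integer(unsigned(addr_a))) := din_a;
        end if;
        -- Synchronous read; returns the newly written word during a write.
        dout_a <= ram(to_integer(unsigned(addr_a)));
      end if;
    end if;
  end process port_a;

  port_b : process (clk_b)
  begin
    if rising_edge(clk_b) then
      if en_b = '1' then
        if we_b = '1' then
          ram(to_integer(unsigned(addr_b))) := din_b;
        end if;
        dout_b <= ram(to_integer(unsigned(addr_b)));
      end if;
    end if;
  end process port_b;
end architecture behavioral;
```

Using only one of the two ports gives the single-port behaviour mentioned in the
text; simultaneous writes to the same address from both ports are not resolved by
this model.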


                                                              10HARDWARE DESCRIPTIVE LANGUAGE (VHDL)

                                                              Why (V)HDL?

                                                              Interoperability

                                                              Technology independence

                                                              Design reuse

                                                              Several levels of abstraction

                                                              Readability

                                                              Standard language

                                                              Widely supported

                                                              What is VHDL?

                                                              VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
                                                              Integrated Circuit)

                                                              Design specification language

                                                              Design entry language

                                                              Design simulation language

                                                              Design documentation language

                                                              An alternative to schematics

                                                              10.1 Brief History

                                                              VHDL was developed in the early 1980s for managing design problems that
                                                              involved large circuits and multiple teams of engineers.

                                                              Funded by the US Department of Defense.

                                                              The first publicly available version was released in 1985.

                                                              In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was
                                                              presented with a proposal to standardize VHDL.

                                                              In 1987, standardization => IEEE 1076-1987.

                                                              An improved version of the language was released in 1994 => IEEE
                                                              Standard 1076-1993.

                                                              10.2 Related Standards

                                                              IEEE 1076 doesn't support simulation conditions such as unknown and
                                                              high-impedance.

                                                              Soon after IEEE 1076-1987 was released, simulator companies began using
                                                              their own non-standard types => VHDL was becoming non-standard.

                                                              The IEEE 1164 standard was developed by the IEEE.

                                                              IEEE 1164 contains definitions for a nine-valued data type, std_logic.

                                                              IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
                                                              to actual hardware.

                                                              Defines, e.g., two numeric types: signed and unsigned.
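A short VHDL sketch shows where these standards appear in practice. The package
std_logic_1164 supplies std_logic, whose nine values include 'Z' (high impedance)
and 'X' (unknown), and the package numeric_std supplies the signed and unsigned
types. The entity name types_demo and its ports are hypothetical and serve only as
an illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use ieee.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

-- Hypothetical example combining std_logic values with numeric types.
entity types_demo is
  port (
    en       : in  std_logic;
    a, b     : in  std_logic_vector(7 downto 0);
    sum      : out std_logic_vector(7 downto 0);
    bus_line : out std_logic
  );
end entity types_demo;

architecture rtl of types_demo is
begin
  -- Drive the line when enabled, otherwise release it to high impedance.
  bus_line <= a(0) when en = '1' else 'Z';

  -- Treat the bit vectors as unsigned numbers to add them (result wraps at 8 bits).
  sum <= std_logic_vector(unsigned(a) + unsigned(b));
end architecture rtl;
```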

                                                              VHDL Environment

                                                              Design Units


                                                              Segments of VHDL code that can be compiled separately and stored in a

                                                              library

                                                              Entities

                                                              A black box with interface definition

Defines the inputs/outputs of a component (defines pins)

                                                              A way to represent modularity in VHDL

                                                              Similar to symbol in schematic

                                                              Entity declaration describes entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration

Port directions:

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed

Out: Assignments can be made to the port, but data from the port cannot be read. Multiple assignments are allowed

Inout: Bidirectional. Assignments can be made and data can be read. Multiple assignments are allowed

Buffer: An out port with read capability. May have at most one assignment (buffer ports are not recommended)
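The port modes can be illustrated with a small counter. This is a sketch under assumptions: the entity port_demo and its port names are hypothetical, and it uses an internal signal instead of a buffer port, which is one common way to work around the rule that an out port cannot be read (a restriction that VHDL-2008 relaxes).

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity port_demo is
  port (
    clk   : in    std_logic;                     -- may only be read
    oe    : in    std_logic;
    count : out   std_logic_vector(3 downto 0);  -- may only be assigned
    data  : inout std_logic_vector(3 downto 0)   -- may be read and assigned
  );
end port_demo;

architecture rtl of port_demo is
  -- Internal copy, because the out port cannot be read back (pre-VHDL-2008).
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      count_i <= count_i + 1;
    end if;
  end process;

  count <= std_logic_vector(count_i);
  -- Drive the bidirectional bus only when enabled; otherwise release it.
  data <= std_logic_vector(count_i) when oe = '1' else (others => 'Z');
end rtl;
```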

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link an entity declaration and an architecture body together

The concept of a default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., a gate-level architecture and a behavioral architecture

Configurations are always optional
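As an illustration of how a configuration selects one of several architectures, the following sketch reuses the Comparator entity from the example above. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions made only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow architecture, as in the example above.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- A second, process-based architecture of the same entity (hypothetical).
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Explicitly select Comparator1 instead of relying on analysis order.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg rather than the bare entity makes the choice of architecture independent of which one happened to be analyzed last.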

Packages: Packages contain information common to many design units

1 Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2 Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units

A package consists of a declaration part and an optional body part

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
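The split between package declaration and package body might look as follows. This is a minimal sketch: the package name util_pkg and everything declared in it are hypothetical, chosen to show a constant, a subtype, a deferred constant and a function.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package util_pkg is
  constant DATA_WIDTH : natural := 8;                            -- constant declaration
  subtype byte_t is std_logic_vector(DATA_WIDTH - 1 downto 0);   -- subtype declaration
  constant RESET_VALUE : byte_t;                                 -- deferred constant
  function parity(v : std_logic_vector) return std_logic;        -- function declaration
end util_pkg;

package body util_pkg is
  -- The value of the deferred constant is given only in the body.
  constant RESET_VALUE : byte_t := (others => '0');

  -- XOR of all bits: '1' when the number of '1' bits is odd.
  function parity(v : std_logic_vector) return std_logic is
    variable p : std_logic := '0';
  begin
    for i in v'range loop
      p := p xor v(i);
    end loop;
    return p;
  end parity;
end util_pkg;
```

A design unit compiled into the same library would make these names visible with "use work.util_pkg.all;".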

Libraries: Collection of VHDL design units (database)

1 Packages
- package declaration
- package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

A library is usually a directory in the UNIX file system

It can also be any other kind of database
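How a design unit refers to a library can be sketched as below. The entity lib_demo is hypothetical; the library and use clauses are the standard VHDL mechanism for making library contents visible.

```vhdl
library ieee;                 -- make the library name visible
use ieee.std_logic_1164.all;  -- import every declaration of one of its packages

-- "work" denotes the library the current unit is compiled into;
-- it is visible without an explicit library clause.
entity lib_demo is
  port (a : in std_logic; y : out std_logic);
end lib_demo;

architecture rtl of lib_demo is
begin
  y <= not a;
end rtl;
```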

                                                              Levels of Abstraction

                                                              VHDL supports many possible styles of design description which differ

                                                              primarily in how closely they relate to the HW

                                                              It is possible to describe a circuit in a number of ways

Structural, Dataflow, Behavioral (in order of increasing level of abstraction)

                                                              Structural VHDL description

                                                              Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

For large circuits a low-level description quickly becomes impractical
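A structural description can be sketched with a half adder built from two gate components. All names here (and2, xor2, half_adder) are hypothetical and not taken from the project code; the point is that the top architecture contains only component instances and their wiring.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
  y <= a and b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity xor2 is
  port (a, b : in std_logic; y : out std_logic);
end xor2;

architecture rtl of xor2 is
begin
  y <= a xor b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
  port (x, y : in std_logic; sum, carry : out std_logic);
end half_adder;

-- Structural style: only component instances and their interconnections.
architecture structural of half_adder is
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component xor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
begin
  u_sum   : xor2 port map (a => x, b => y, y => sum);
  u_carry : and2 port map (a => x, b => y, y => carry);
end structural;
```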

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in the system

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no third-party models for registers => you must write the behavioral description of the registers yourself

The behavioral description can be provided in the form of subprograms (functions or procedures)
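The points above can be sketched as a register written behaviorally and then used in a dataflow-style architecture. The names dreg and masked_pipe are hypothetical, and the active-low reset rstn is an assumption borrowed from the naming used later in the controller code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity dreg is
  port (clk, rstn : in  std_logic;
        d         : in  std_logic_vector(7 downto 0);
        q         : out std_logic_vector(7 downto 0));
end dreg;

-- Behavioral description of a register, since VHDL has no built-in one.
architecture behavioral of dreg is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');
    elsif rising_edge(clk) then
      q <= d;
    end if;
  end process;
end behavioral;

library ieee;
use ieee.std_logic_1164.all;

entity masked_pipe is
  port (clk, rstn : in  std_logic;
        a, mask   : in  std_logic_vector(7 downto 0);
        y         : out std_logic_vector(7 downto 0));
end masked_pipe;

-- Dataflow style: concurrent assignments describe the logic between registers.
architecture dataflow of masked_pipe is
  signal a_q, masked : std_logic_vector(7 downto 0);
begin
  in_reg  : entity work.dreg port map (clk => clk, rstn => rstn, d => a, q => a_q);
  masked  <= a_q and mask;
  out_reg : entity work.dreg port map (clk => clk, rstn => rstn, d => masked, q => y);
end dataflow;
```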

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, only the order of sequential operations is defined

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis
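The delay notation mentioned above can be placed in a complete unit as follows. This is a sketch with a hypothetical entity name (delay_buf); the 10 ns figure is the one used in the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_buf is
  port (B : in  std_logic;
        A : out std_logic);
end delay_buf;

architecture behavioral of delay_buf is
begin
  -- In simulation, A follows B 10 ns later.
  -- Synthesis tools typically ignore the "after" clause.
  A <= B after 10 ns;
end behavioral;
```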

Concurrent Vs Sequential

Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Written with the process keyword in VHDL

A process statement is a concurrent statement

Statements inside a process statement are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) contain the signals that wake the process up

General format

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
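The two permitted forms, sensitivity list or wait statement, can be compared side by side. In this sketch the entity and_gate and its ports are hypothetical; both processes describe the same AND function and differ only in how they are woken up.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end and_gate;

architecture behavioral of and_gate is
begin
  -- Form 1: sensitivity list; the process wakes up on any event on a or b.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list, explicit wait at the end of the process.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process;
end behavioral;
```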


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock clk
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  -- Sync generator: produces the sync pulses, enables and counters
  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  -- Pixel generator: drives the colour outputs and the memory address
  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Picture memory: one bit per pixel, read when rd_s is asserted
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

                                                              --hysnc pulse

                                                              process(clkrstn)

                                                              begin

                                                              if(rstn = 0)then

                                                              47

                                                              hsync lt= 1

                                                              elsif(clk = 1 and clkevent)then

                                                              if(c_hs_s lt=95)then

                                                              hsync lt= 0

                                                              else

                                                              hsync lt= 1

                                                              end if

                                                              end if

                                                              --end if

                                                              end process

                                                              process(clkrstn) ----Horizontal Enable ( Row Information Will be Sent in

                                                              this period)

                                                              begin

                                                              if(rstn = 0)then

                                                              h_en_s lt= 0

                                                              elsif(clk = 1 and clkevent)then

                                                              if((c_hs_s gt=143) and (c_hs_slt=778))then

                                                              h_en_s lt= 1

                                                              else

                                                              h_en_s lt= 0

                                                              end if

                                                              end if

                                                              --end if

                                                              end process

-- vsync pulse

-- Vertical enable: data is valid in this region.
process(clk, rstn)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
end process;

-- Vertical sync, front porch, back porch.
process(clk, rstn)
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
end process;

-- One-cycle delayed copy of the vertical enable.
process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;

                                                              PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of the vertical enable: restart at the first pixel address.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read the next pixel and drive all three colours.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last pixel.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;

                                                              12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor, implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates
1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size
1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy-efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company credits them with the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.
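The CPLD structure described above can be sketched in VHDL as a single macrocell: an AND plane forms product terms, an OR plane sums them, and the result feeds one clocked register. This is only an illustrative sketch. The entity name sop_macrocell, the port names, and the three product terms are hypothetical and fixed here, whereas in a real CPLD the product terms are programmable.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical CPLD-style macrocell (names and terms are assumptions):
-- a sum-of-products array feeding one clocked register.
entity sop_macrocell is
  port (
    clk : in  std_logic;
    x   : in  std_logic_vector(3 downto 0);  -- inputs to the AND plane
    q   : out std_logic                      -- registered macrocell output
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal p0, p1, p2 : std_logic;  -- product terms
  signal sum        : std_logic;  -- sum of products
begin
  -- AND plane: each product term is an AND of true or inverted inputs.
  p0 <= x(0) and x(1);
  p1 <= (not x(2)) and x(3);
  p2 <= x(0) and (not x(3));

  -- OR plane: the product terms are summed.
  sum <= p0 or p1 or p2;

  -- The register captures the sum on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end architecture rtl;
```

Because every input reaches the register through the same two-level AND-OR path, the delay is the same for any function the macrocell implements, which is the predictable timing the text refers to.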

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a wait of 4 to 8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                                              Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

                                                              121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                              Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
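The logic block described above can be sketched in VHDL as a 16-entry truth table followed by a flip-flop and an output selector. This is a behavioural sketch only, not the implementation of any particular device. The entity name lut4_ff and the generics INIT and USE_REG are hypothetical, standing in for the configuration bits a real FPGA would load. The sketch uses ieee.numeric_std rather than the std_logic_arith packages used in the listings earlier in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block (names are assumptions):
-- a 4-input LUT, a flip-flop, and a selector for the single output.
entity lut4_ff is
  generic (
    INIT    : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
    USE_REG : boolean := true                             -- true selects the registered output
  );
  port (
    clk : in  std_logic;                     -- clock input
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity lut4_ff;

architecture rtl of lut4_ff is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry truth table indexed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection: registered or unregistered LUT output.
  y <= ff_out when USE_REG else lut_out;
end architecture rtl;
```

With the default INIT value only entry 15 of the table is '1', so the block outputs '1' only when all four inputs are '1'. Changing INIT changes the implemented function without changing the structure, which is what makes the block programmable.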

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                              Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

                                                              Each logic block output pin can connect to any of the wiring segments in the

                                                              channels adjacent to it

                                                              Similarly an IO pad can connect to any one of the wiring segments in the

                                                              channel adjacent to it For example an IO pad at the top of the chip can

                                                              connect to any of the W wires (where W is the channel width) in the horizontal

                                                              channel immediately below it

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Wherever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

Fig. Switch box topology
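
The planar rule can be written down as a small connectivity model. The sketch
below is an assumption-laden illustration rather than real routing hardware:
the package name switch_box_pkg, the type side_t and the function has_switch
are hypothetical names, and the four sides of the switch box are simply
labelled north, east, south and west.

```vhdl
package switch_box_pkg is
  -- The four channel segments that meet at a switch box.
  type side_t is (north, east, south, west);

  -- True when a programmable switch exists between the two wire ends.
  function has_switch(side_a : side_t; track_a : natural;
                      side_b : side_t; track_b : natural) return boolean;
end package switch_box_pkg;

package body switch_box_pkg is
  function has_switch(side_a : side_t; track_a : natural;
                      side_b : side_t; track_b : natural) return boolean is
  begin
    -- Planar (domain-based) topology: a wire may only connect to the wire
    -- with the same track number on one of the other three sides.
    return (track_a = track_b) and (side_a /= side_b);
  end function has_switch;
end package body switch_box_pkg;
```

With four sides, every wire end has exactly three possible partners, which
matches the three programmable switches mentioned in the text.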

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic and embedded memories.
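
As an illustration, a multiplier is usually described behaviorally and left
to the synthesis tool, which may map it onto an embedded multiplier or DSP
block if the target device has one. Whether that happens depends on the tool
and the device. The entity name mult18 and the 18-bit operand width are
assumptions chosen for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity mult18 is
  port (
    clk : in  std_logic;
    a   : in  signed(17 downto 0);
    b   : in  signed(17 downto 0);
    p   : out signed(35 downto 0)
  );
end entity mult18;

architecture rtl of mult18 is
begin
  -- Registered multiply; the result of "*" in numeric_std is
  -- a'length + b'length bits wide, here 36 bits.
  process (clk)
  begin
    if rising_edge(clk) then
      p <= a * b;
    end if;
  end process;
end architecture rtl;
```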

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) description or a schematic design. The HDL form might be easier
to work with when handling large structures, because it's possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
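
A test bench for the first (RTL) simulation stage might look like the sketch
below. The design under test, a two-input AND gate called and2, and the test
bench name and2_tb are hypothetical and are used only to keep the example
self-contained.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a 2-input AND gate.
entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports; it only generates stimulus and checks results.
entity and2_tb is
end entity and2_tb;

architecture sim of and2_tb is
  signal a, b, y : std_logic := '0';
begin
  dut : entity work.and2 port map (a => a, b => b, y => y);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert y = '0' report "0 and 0 should give 0" severity error;
    a <= '0'; b <= '1'; wait for 10 ns;
    assert y = '0' report "0 and 1 should give 0" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert y = '0' report "1 and 0 should give 0" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert y = '1' report "1 and 1 should give 1" severity error;
    report "and2_tb finished" severity note;
    wait;  -- suspend the process forever
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and
back-annotated simulations by binding dut to the corresponding netlist.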

Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design
has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become
a good method to do this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This contrasts with designers relying only on
software simulations to verify that their hardware design is sound, an
approach that is hindered by simulation speed and modeling accuracy
limitations.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows for
much more time in the area of software development and testing at the
software-hardware integration stage. This allows many unforeseen software
bugs, which appear due to today's array of operating systems, applications
and hardware, to be uncovered.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. VGA controller (gate level)

Fig. Sync unit (gate level)

Fig. Technology schematic

                                                              14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.
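
The generation of the synchronization signals can be sketched as two
counters. The sketch below assumes the 640x480 mode at 60 Hz with its
commonly published timing (800 pixel clocks per line, 525 lines per frame,
a pixel clock of about 25.175 MHz, both sync pulses active low). The entity
name vga_sync, its port names and the visible output are hypothetical and are
not taken from the sync unit of this project.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity vga_sync is
  port (
    pix_clk : in  std_logic;  -- assumed pixel clock of about 25.175 MHz
    hsync   : out std_logic;  -- horizontal sync, active low
    vsync   : out std_logic;  -- vertical sync, active low
    visible : out std_logic   -- '1' while inside the 640x480 picture area
  );
end entity vga_sync;

architecture rtl of vga_sync is
  -- 640x480 at 60 Hz timing, in pixel clocks (horizontal) and lines (vertical).
  constant H_VISIBLE : natural := 640;
  constant H_FRONT   : natural := 16;
  constant H_SYNC    : natural := 96;
  constant H_TOTAL   : natural := 800;  -- includes a 48-pixel back porch
  constant V_VISIBLE : natural := 480;
  constant V_FRONT   : natural := 10;
  constant V_SYNC    : natural := 2;
  constant V_TOTAL   : natural := 525;  -- includes a 33-line back porch

  signal h_cnt : natural range 0 to H_TOTAL - 1 := 0;
  signal v_cnt : natural range 0 to V_TOTAL - 1 := 0;
begin
  -- Pixel counter wraps at the end of each line; line counter wraps at the
  -- end of each frame.
  counters : process (pix_clk)
  begin
    if rising_edge(pix_clk) then
      if h_cnt = H_TOTAL - 1 then
        h_cnt <= 0;
        if v_cnt = V_TOTAL - 1 then
          v_cnt <= 0;
        else
          v_cnt <= v_cnt + 1;
        end if;
      else
        h_cnt <= h_cnt + 1;
      end if;
    end if;
  end process counters;

  -- Sync pulses start after the front porch and last for the sync width.
  hsync <= '0' when h_cnt >= H_VISIBLE + H_FRONT and
                    h_cnt <  H_VISIBLE + H_FRONT + H_SYNC else '1';
  vsync <= '0' when v_cnt >= V_VISIBLE + V_FRONT and
                    v_cnt <  V_VISIBLE + V_FRONT + V_SYNC else '1';

  -- RGB values should only be driven while this signal is '1'.
  visible <= '1' when h_cnt < H_VISIBLE and v_cnt < V_VISIBLE else '0';
end architecture rtl;
```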

10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

Interoperability
Technology independence
Design reuse
Several levels of abstraction
Readability
Standard language
Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed
Integrated Circuit)
Design specification language
Design entry language
Design simulation language
Design documentation language
An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that
involved large circuits and multiple teams of engineers.
Funded by the US Department of Defense.
The first publicly available version was released in 1985.
In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.)
was presented with a proposal to standardize VHDL.
In 1987, standardization => IEEE 1076-1987.
An improved version of the language was released in 1994 => IEEE
standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and
high-impedance.
Soon after IEEE 1076-1987 was released, simulator companies began using
their own non-standard types => VHDL was becoming non-standard.
The IEEE 1164 standard was developed by the IEEE.
IEEE 1164 contains definitions for a nine-valued data type, std_logic.
IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate
to actual hardware.
Defines, e.g., two numeric types: signed and unsigned.
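
A short sketch of how these standards show up in code. The nine std_logic
values are 'U', 'X', '0', '1', 'Z', 'W', 'L', 'H' and '-'. The entity name
types_demo and its ports are hypothetical names used only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- IEEE 1164: std_logic, std_logic_vector
use ieee.numeric_std.all;     -- IEEE 1076.3: signed, unsigned

entity types_demo is
  port (
    enable : in  std_logic;
    a, b   : in  std_logic_vector(7 downto 0);
    sum    : out std_logic_vector(7 downto 0)
  );
end entity types_demo;

architecture rtl of types_demo is
  signal total : unsigned(7 downto 0);
begin
  -- Arithmetic is defined on unsigned/signed, not on std_logic_vector,
  -- so the operands are converted first.
  total <= unsigned(a) + unsigned(b);

  -- 'Z' (high impedance) is one of the nine std_logic values; the output
  -- is only driven while enable is '1'.
  sum <= std_logic_vector(total) when enable = '1' else (others => 'Z');
end architecture rtl;
```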

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a
library.

Entities

A black box with interface definition.
Defines the inputs/outputs of a component (define pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
Entity declaration describes the entity.

E.g.

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its
environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions:

In: A value of a port can be read inside the component, but cannot be
assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read.
Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(buffer ports are not recommended).
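
The four port directions can be seen side by side in the following sketch.
The entity port_modes and all of its port names are hypothetical. The rule
that out ports cannot be read applies to the VHDL '87 and '93 versions
discussed in this chapter.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity port_modes is
  port (
    clk      : in     std_logic;                     -- in: read only
    wr_en    : in     std_logic;
    data_in  : in     std_logic_vector(7 downto 0);
    bus_data : inout  std_logic_vector(7 downto 0);  -- inout: driven and read
    captured : out    std_logic_vector(7 downto 0);  -- out: assigned only
    toggle   : buffer std_logic := '0'               -- buffer: assigned and read back
  );
end entity port_modes;

architecture rtl of port_modes is
begin
  -- Drive the bidirectional bus while writing; otherwise release it to 'Z'
  -- so that another device can drive it.
  bus_data <= data_in when wr_en = '1' else (others => 'Z');

  process (clk)
  begin
    if rising_edge(clk) then
      if wr_en = '0' then
        captured <= bus_data;  -- reading the inout port
      end if;
      toggle <= not toggle;    -- reading toggle is legal for a buffer port
    end if;
  end process;
end architecture rtl;
```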

Architectures: Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using
- Behavior
- Structure
- Dataflow
Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level
A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link the entity declaration and architecture body together.
The concept of a default configuration is a bit messy in VHDL '87
- Last architecture analyzed links to entity
Can be used to change simulation behavior without re-analyzing the VHDL
source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral
architecture.
Configurations are always optional.
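
A sketch of one entity with two architectures and a configuration declaration
that chooses between them. The Comparator entity follows the earlier example;
the architecture names Dataflow and Behavioral, the entity Top and the
configuration Top_Behavioral are hypothetical names introduced here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- First architecture: dataflow style.
architecture Dataflow of Comparator is
begin
  EQ <= '1' when A = B else '0';
end architecture Dataflow;

-- Second architecture of the same entity: behavioral style.
architecture Behavioral of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Behavioral;

library ieee;
use ieee.std_logic_1164.all;

entity Top is
  port (X, Y : in  std_logic_vector(7 downto 0);
        Same : out std_logic);
end entity Top;

architecture Structure of Top is
  component Comparator is
    port (A, B : in  std_logic_vector(7 downto 0);
          EQ   : out std_logic);
  end component Comparator;
begin
  U1 : Comparator port map (A => X, B => Y, EQ => Same);
end architecture Structure;

-- The configuration selects which architecture instance U1 is bound to.
configuration Top_Behavioral of Top is
  for Structure
    for U1 : Comparator
      use entity work.Comparator(Behavioral);
    end for;
  end for;
end configuration Top_Behavioral;
```

Simulating Top_Behavioral instead of Top switches the comparator model
without touching the source of Top.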

Packages: Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.
A package consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
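
A minimal package with a declaration part, a body and a design unit that uses
it. All names (vga_pkg, H_VISIBLE, BORDER, colour_t, invert, negative) and the
constant values are hypothetical and were chosen only for this sketch.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

package vga_pkg is
  constant H_VISIBLE : natural := 640;            -- ordinary constant
  constant BORDER    : natural;                   -- deferred constant
  subtype colour_t is std_logic_vector(7 downto 0);
  function invert(c : colour_t) return colour_t;  -- declaration only
end package vga_pkg;

package body vga_pkg is
  -- The value of the deferred constant is given in the body.
  constant BORDER : natural := 8;

  function invert(c : colour_t) return colour_t is
  begin
    return not c;
  end function invert;
end package body vga_pkg;

library ieee;
use ieee.std_logic_1164.all;
use work.vga_pkg.all;  -- make the package contents visible

entity negative is
  port (pix_in : in colour_t; pix_out : out colour_t);
end entity negative;

architecture rtl of negative is
begin
  pix_out <= invert(pix_in);
end architecture rtl;
```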

Libraries: Collection of VHDL design units (database).

1. Packages
- package declaration
- package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.
Can also be any other kind of database.

                                                                Levels of Abstraction

                                                                VHDL supports many possible styles of design description which differ

                                                                primarily in how closely they relate to the HW

                                                                It is possible to describe a circuit in a number of ways

                                                                Structural-------

                                                                Dataflow ------- Higher level of abstraction

                                                                Behavioral -------

Structural VHDL Description

The circuit is described in terms of its components.
The description can range from a low level (e.g. transistor level) to a high
level (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.
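As a minimal sketch of the structural style, the listing below builds a half adder purely by wiring two smaller components together. The names xor_gate, and_gate and half_adder are hypothetical and are not part of the VGA design in this report.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Leaf component: 2-input XOR gate (hypothetical name)
entity xor_gate is
    port (a, b : in std_logic; y : out std_logic);
end xor_gate;

architecture rtl of xor_gate is
begin
    y <= a xor b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Leaf component: 2-input AND gate (hypothetical name)
entity and_gate is
    port (a, b : in std_logic; y : out std_logic);
end and_gate;

architecture rtl of and_gate is
begin
    y <= a and b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity half_adder is
    port (a, b : in std_logic; sum, carry : out std_logic);
end half_adder;

-- Structural architecture: it contains only component instances and wiring
architecture structural of half_adder is
    component xor_gate
        port (a, b : in std_logic; y : out std_logic);
    end component;
    component and_gate
        port (a, b : in std_logic; y : out std_logic);
    end component;
begin
    u_xor : xor_gate port map (a => a, b => b, y => sum);
    u_and : and_gate port map (a => a, b => b, y => carry);
end structural;
```

The top-level VGA controller later in this report follows the same pattern: it instantiates the sync unit, pixel generator and memory and connects them with signals.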

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between the registers
in the system.
The combinational logic is described at a relatively high level, while the
placement and operation of the registers is specified quite precisely.
The behavior of the system over time is defined by the registers.
There are no built-in registers in the VHDL language, so one of the following
is needed:
- a lower-level description, or
- a behavioral description of the sequential elements.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for the registers, you must write the
behavioral description of the registers yourself.
The behavioral description can be provided in the form of subprograms
(functions or procedures).
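The following sketch illustrates the dataflow style under stated assumptions: the combinational logic of a one-bit full adder is written as concurrent assignments, and the output register is described behaviorally in a clocked process because the language has no built-in register. The entity name add_reg and its port names are made up for this example.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity add_reg is
    port (
        clk    : in  std_logic;
        a, b   : in  std_logic;
        cin    : in  std_logic;
        sum_q  : out std_logic;  -- registered sum
        cout_q : out std_logic   -- registered carry
    );
end add_reg;

architecture dataflow of add_reg is
    signal sum_d, cout_d : std_logic;
begin
    -- Dataflow part: concurrent assignments describe the combinational logic
    sum_d  <= a xor b xor cin;
    cout_d <= (a and b) or (cin and (a xor b));

    -- Register part: described behaviorally, since VHDL has no register primitive
    reg : process (clk)
    begin
        if rising_edge(clk) then
            sum_q  <= sum_d;
            cout_q <= cout_d;
        end if;
    end process reg;
end dataflow;
```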

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
The representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).
If no actual delay is used, only the order of the sequential operations is defined.
At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.
The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.
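To make the delay notation concrete, here is a small simulation-only sketch. The entity delay_demo, the signal names and the stimulus times are assumptions chosen for illustration; the after clause is meaningful in simulation, and synthesis tools would normally ignore it.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
    signal a : std_logic := '0';
    signal b : std_logic := '0';
begin
    -- Explicit delay: a follows b 10 ns later
    a <= b after 10 ns;

    stimulus : process
    begin
        wait for 20 ns;
        b <= '1';                 -- b rises at 20 ns
        wait for 5 ns;            -- now at 25 ns, before the delay has elapsed
        assert a = '0' report "a changed too early" severity error;
        wait for 6 ns;            -- now at 31 ns, after the delay has elapsed
        assert a = '1' report "a did not follow b" severity error;
        wait;                     -- suspend forever
    end process stimulus;
end behavioral;
```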

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
This is quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.
The statements inside a process statement are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
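The two forms of the general format can be sketched side by side. In the example below (the entity name and2_proc and its ports are hypothetical), the first process uses a sensitivity list and the second uses a wait statement at the end; both wake up whenever a or b changes and should behave identically.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity and2_proc is
    port (a, b : in std_logic; y1, y2 : out std_logic);
end and2_proc;

architecture behavioral of and2_proc is
begin
    -- Form 1: sensitivity list, no wait statement allowed inside
    with_list : process (a, b)
    begin
        y1 <= a and b;
    end process with_list;

    -- Form 2: no sensitivity list, explicit wait statement instead
    with_wait : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process with_wait;
end behavioral;
```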


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;  -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- Picture memory: 18-bit address, one bit per pixel
    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    -- line_count and c_hs are declared as "out" here so that the component
    -- matches the port modes of the sync_unit entity
    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the input clock clk50m by two to produce the internal clock clk
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- Horizontal counter: counts clock cycles 0..793 within each line
    process (clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- Vertical line counter: advances on each rising edge of h_en_s
    -- and wraps to 0 after line 515
    process (h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: low while the horizontal count is 0..95
    process (clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information will be sent in this period)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse: low while the line count is below 2; the front porch
    -- and back porch fall in the remaining lines outside the enable region
    process (clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en_reg is v_en_s delayed by one clock cycle
    process (clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- v_en high while its delayed copy is still low: clear the address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Read memory and drive all three colors with the same bit
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Horizontal count above 655: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Line count above 514: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address when it reaches 111011111111111111
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Outside the enabled region: black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up and eroded a significant part of its market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

                                                                Gates

1987: 9,000 gates, Xilinx

1992: 600,000 gates, Naval Surface Warfare Department

Early 2000s: millions of gates

                                                                Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

                                                                IC costs are rising aggressively

                                                                ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

                                                                Revenue losses for slow time-to-market are increasing

                                                                Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
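The trade-off between number precision and the number of parallel arithmetic units can be sketched in VHDL, the language used for the design in this report. The entity below is only an illustration under stated assumptions: the name parallel_mult, the generics LANES and WIDTH, and the packing of all operands into flat vectors are hypothetical choices, not part of the VGA controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical sketch: LANES independent multipliers that all produce a
-- result on the same clock edge. Reducing WIDTH makes each multiplier
-- smaller, so more lanes fit into the same amount of logic.
entity parallel_mult is
  generic (
    LANES : positive := 4;  -- number of parallel arithmetic units (assumed)
    WIDTH : positive := 8   -- operand precision in bits (assumed)
  );
  port (
    clk : in  std_logic;
    -- Operands and products of all lanes are packed into flat vectors.
    a   : in  std_logic_vector(LANES * WIDTH - 1 downto 0);
    b   : in  std_logic_vector(LANES * WIDTH - 1 downto 0);
    p   : out std_logic_vector(LANES * 2 * WIDTH - 1 downto 0)
  );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
  -- One registered multiplier is generated per lane.
  lanes_gen : for i in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        -- unsigned * unsigned yields a 2*WIDTH-bit product.
        p((i + 1) * 2 * WIDTH - 1 downto i * 2 * WIDTH) <=
          std_logic_vector(
            unsigned(a((i + 1) * WIDTH - 1 downto i * WIDTH)) *
            unsigned(b((i + 1) * WIDTH - 1 downto i * WIDTH)));
      end if;
    end process;
  end generate lanes_gen;
end architecture rtl;
```

With the default generics this describes four 8-bit multipliers; halving WIDTH and doubling LANES keeps the operand ports the same size while doubling the number of products per clock cycle, at reduced precision.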

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware cost per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                                Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
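The logic block described above can be written down as a short VHDL model. This is a minimal sketch, not any vendor's actual primitive: the entity name logic_block and the generics LUT_INIT and REGISTERED are hypothetical stand-ins for the configuration bits that a real device would load at configuration time.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: one 4-input LUT, one
-- D flip-flop, and a selector for the registered or unregistered output.
entity logic_block is
  generic (
    -- 16-entry truth table, one bit per input combination (assumed name).
    -- x"8000" sets only entry 15, which makes the LUT a 4-input AND.
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000";
    -- Stands in for the configuration bit that selects the output.
    REGISTERED : boolean := true
  );
  port (
    clk     : in  std_logic;
    lut_in  : in  std_logic_vector(3 downto 0);
    blk_out : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT is a 16-to-1 lookup indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- Single output: registered or combinational, fixed by the generic.
  blk_out <= ff_q when REGISTERED else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which the block is programmable.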

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it is possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a configuration file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. Because designing in HDLs has been compared to programming in assembly language, there are moves to reduce the complexity by raising the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
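The first stage of this flow, RTL simulation with a test bench, might look like the minimal sketch below. The design under test (`and_gate`) and the test bench (`and_gate_tb`) are hypothetical names chosen for illustration and are not part of the VGA controller; the 10 ns steps are arbitrary.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a 2-input AND gate.
entity and_gate is
  port (a, b : in  std_logic;
        y    : out std_logic);
end entity and_gate;

architecture rtl of and_gate is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: has no ports, drives the DUT and checks its output.
entity and_gate_tb is
end entity and_gate_tb;

architecture sim of and_gate_tb is
  signal a, b, y : std_logic := '0';
begin
  dut : entity work.and_gate
    port map (a => a, b => b, y => y);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert y = '0' report "0 and 0 should be 0" severity error;
    a <= '1'; b <= '0'; wait for 10 ns;
    assert y = '0' report "1 and 0 should be 0" severity error;
    a <= '1'; b <= '1'; wait for 10 ns;
    assert y = '1' report "1 and 1 should be 1" severity error;
    report "test bench finished";
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can, in principle, be rerun against the gate-level netlist produced by synthesis, which is what makes the repeated simulation stages comparable.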

                                                                Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.


Reasons why prototyping is important

1 Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is in contrast to designers relying only on software simulations to verify that their hardware design is sound; simulation speed and modeling accuracy limitations hinder that approach.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications and hardware.

3 Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.

4 Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.


                                                                13 SIMULATION RESULTS

Fig. VGA controller (gate level)

Fig. Sync unit (gate level)

Fig. Technology schematic

                                                                14 APPLICATIONS

                                                                Motion Tracking Systems

                                                                Image processing systems

                                                                Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




10 HARDWARE DESCRIPTION LANGUAGE (VHDL)

Why (V)HDL?

                                                                  Interoperability

                                                                  Technology independence

                                                                  Design reuse

                                                                  Several levels of abstraction

                                                                  Readability

                                                                  Standard language

                                                                  Widely supported

What is VHDL?

VHDL = VHSIC Hardware Description Language (VHSIC = Very High-Speed Integrated Circuit)

                                                                  Design specification language

                                                                  Design entry language

                                                                  Design simulation language

                                                                  Design documentation language

                                                                  An alternative to schematics

10.1 Brief History

VHDL was developed in the early 1980s for managing design problems that involved large circuits and multiple teams of engineers.

Funded by the US Department of Defense.

The first publicly available version was released in 1985.

In 1986 the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987 standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was subsequently developed by the IEEE.

IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware.

Defines, e.g., two numeric types: signed and unsigned.
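As a small illustration of these standards, the sketch below uses `std_logic` from IEEE 1164 (package `std_logic_1164`) and `signed`/`unsigned` from IEEE 1076.3 (package `numeric_std`). The entity and signal names are hypothetical and exist only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector
use ieee.numeric_std.all;     -- signed, unsigned

entity std_types_demo is
end entity std_types_demo;

architecture sim of std_types_demo is
  signal undriven : std_logic;          -- no driver: stays 'U' (uninitialized)
  signal tri      : std_logic := 'Z';   -- high impedance
  signal count    : unsigned(3 downto 0) := to_unsigned(15, 4);
  signal offset   : signed(3 downto 0)   := to_signed(-3, 4);
begin
  check : process
  begin
    wait for 1 ns;
    assert undriven = 'U' report "expected 'U'" severity error;
    assert tri = 'Z' report "expected 'Z'" severity error;
    -- unsigned arithmetic wraps modulo 2**4: 15 + 1 gives 0
    assert count + 1 = 0 report "expected wrap to 0" severity error;
    -- two's complement: -3 in four bits is "1101"
    assert to_integer(offset) = -3 report "expected -3" severity error;
    assert std_logic_vector(offset) = "1101" report "expected 1101" severity error;
    wait;
  end process check;
end architecture sim;
```

The values 'U' and 'Z' are two of the nine std_logic values; they are the unknown-style and high-impedance conditions that plain IEEE 1076 types cannot express.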

                                                                  VHDL Environment

                                                                  Design Units


Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition

Defines the inputs/outputs of a component (defines pins)

A way to represent modularity in VHDL

Similar to a symbol in a schematic

An entity declaration describes the entity

E.g.

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports: Provide channels of communication between the component and its environment

Each port must have a name, a direction and a type

An entity may have NO port declaration

Port directions:

In: The value of a port can be read inside the component, but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from a port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).
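One common way to live with the rule that an out port cannot be read, without resorting to buffer ports, is to keep an internal signal and copy it to the port. The sketch below assumes a hypothetical 4-bit counter (`counter4`); none of its names come from the VGA controller code.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port modes.
entity counter4 is
  port (clk   : in  std_logic;                      -- can only be read inside
        rst   : in  std_logic;
        count : out std_logic_vector(3 downto 0));  -- can only be assigned inside
end entity counter4;

architecture rtl of counter4 is
  -- Internal copy of the count: readable, unlike the out port itself.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if rst = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;  -- reads the internal signal, not the port
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_i);  -- single assignment to the out port
end architecture rtl;
```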

Architectures: Every entity has at least one architecture

One entity can have several architectures

Architectures can describe a design using:
– Behavior
– Structure
– Dataflow

Architectures can describe a design on many levels:
– Gate level
– RTL (Register Transfer Level)
– Behavioral level

A configuration declaration links an architecture to an entity

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: Link an entity declaration and an architecture body together

The concept of a default configuration is a bit messy in VHDL '87
– The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL source

Complex configuration declarations are ignored in synthesis

Some entities can have, e.g., a gate-level architecture and a behavioral architecture

Are always optional
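To make the link between entity, architectures and configuration concrete, here is a minimal sketch built on the Comparator example from the text. The second architecture (`Comparator2`) and the configuration name (`Comparator_cfg`) are assumptions added for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow architecture, as in the text.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Hypothetical second architecture: same function, written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: binds the entity to one of its architectures.
configuration Comparator_cfg of Comparator is
  for Comparator1
  end for;
end configuration Comparator_cfg;
```

Changing `for Comparator1` to `for Comparator2` selects the other architecture without touching the entity or either architecture body.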

Packages: Packages contain information common to many design units

1 Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2 Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units

Consists of a declaration part and an optional body part

Package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

Package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
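A minimal sketch of a package declaration with a matching package body follows. The package name (`vga_pkg`) and everything declared in it are hypothetical; only the 640-pixel width echoes the VGA resolution discussed elsewhere in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; names and values are illustrative only.
package vga_pkg is
  constant H_VISIBLE : natural := 640;              -- constant declaration
  subtype pixel_t is std_logic_vector(2 downto 0);  -- subtype declaration
  constant BLACK : pixel_t;                         -- deferred constant
  function invert(p : pixel_t) return pixel_t;      -- function declaration
end package vga_pkg;

package body vga_pkg is
  constant BLACK : pixel_t := "000";  -- value of the deferred constant

  function invert(p : pixel_t) return pixel_t is
    variable r : pixel_t;
  begin
    r := not p;  -- bitwise inversion of the three colour bits
    return r;
  end function invert;
end package body vga_pkg;
```

A design unit compiled into the same library would typically gain access to these declarations with `use work.vga_pkg.all;`.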

Libraries: Collection of VHDL design units (database)

1 Packages
– package declaration
– package body

2 Entities (entity declaration)

3 Architectures (architecture body)

4 Configurations (configuration declarations)

Usually a directory in a UNIX file system

Can also be any other kind of database

                                                                  Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from lower to higher level of abstraction:

Structural

Dataflow

Behavioral

Structural VHDL description

Circuit is described in terms of its components

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram)

For large circuits, a low-level description quickly becomes impractical

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system

In the dataflow style you describe how information flows between registers in the system

The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely

The behavior of the system over time is defined by registers

There are no built-in registers in the VHDL language
– Either a lower-level description
– or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained

If there are no third-party models for registers => you must write the behavioral description of the registers

The behavioral description can be provided in the form of subprograms (functions or procedures)
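Since VHDL has no built-in register, a behavioral description has to be written when no model is available. The sketch below shows one common way to do this with a clocked process; the entity name (`reg8`), the 8-bit width and the synchronous reset are assumptions made for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with synchronous reset.
entity reg8 is
  port (clk : in  std_logic;
        rst : in  std_logic;
        d   : in  std_logic_vector(7 downto 0);
        q   : out std_logic_vector(7 downto 0));
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk)
  begin
    if rising_edge(clk) then      -- update only on the rising clock edge
      if rst = '1' then
        q <= (others => '0');     -- clear the register
      else
        q <= d;                   -- capture the input
      end if;
    end if;
  end process;
end architecture behavioral;
```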

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time

Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns)

If no actual delay is used, the order of sequential operations is defined

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool

There are a few tools for behavioral synthesis

                                                                  Concurrent Vs Sequential

                                                                  Processes

Basic simulation concept in VHDL

A VHDL description can always be broken up into interconnected processes

Quite similar to a Unix process

Process keyword in VHDL

A process statement is a concurrent statement

Statements inside process statements are sequential statements

A process must contain either a sensitivity list or wait statement(s), but NOT both

The sensitivity list or wait statement(s) contain the signals which wake the process up

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
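The two alternatives in the general format above can be compared side by side. In this sketch (entity and signal names are hypothetical) both processes compute the same function: the first uses a sensitivity list, the second an explicit wait statement placed at the end of the process, which behaves equivalently.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_forms is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end entity process_forms;

architecture sim of process_forms is
begin
  -- Form 1: sensitivity list; the process wakes when a or b changes.
  with_list : process (a, b)
  begin
    y1 <= a xor b;
  end process with_list;

  -- Form 2: no sensitivity list; a wait statement suspends the process
  -- until a or b changes.
  with_wait : process
  begin
    y2 <= a xor b;
    wait on a, b;
  end process with_wait;
end architecture sim;
```

The process statements themselves run concurrently with each other, while the statements inside each one execute in order.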


                                                                  11 VGA CONTROLLER CODE

                                                                  library IEEE

                                                                  use IEEESTD_LOGIC_1164ALL

                                                                  use IEEESTD_LOGIC_ARITHALL

                                                                  use IEEESTD_LOGIC_UNSIGNEDALL

                                                                  use IEEEnumeric_stdALL

                                                                  ---- Uncomment the following library declaration if instantiating

                                                                  ---- any Xilinx primitives in this code

                                                                  --library UNISIM

                                                                  --use UNISIMVComponentsall

                                                                  entity vga_controller is port(clk50min std_logic

                                                                  rstnin std_logic

                                                                  hsyncout std_logic

                                                                  vsyncout std_logic

                                                                  redout std_logic_vector(0 downto 0)

                                                                  greenout std_logic_vector(0 downto 0)

                                                                  blueout std_logic_vector(0 downto 0))

                                                                  end vga_controller

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide clk50m by two; the divided clock drives the sync unit,
  -- the pixel generator, and the memory.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Note: rstn is active low; whether sinit expects that polarity
  -- depends on the generated memory core and is worth checking.
  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;

                                                                  SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- Horizontal counter: counts pixel clocks 0 to 793 within each line.
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- Vertical line counter: advances once per line on the rising
  -- edge of h_en_s and wraps after line 515.
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low for counts 0 to 95.
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period).
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region).
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch: vsync is low for lines 0 and 1.
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- One-clock delayed copy of v_en, used downstream for edge detection.
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
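
The counter limits in the synchronization block imply the timing worked out below. This is a sketch under one assumption: that the clk50m port of the top-level entity carries 50 MHz, as its name suggests (the listing does not state the frequency). With the divide-by-two in the top level, the horizontal counter running 0 to 793 (794 clocks per line), the line counter running 0 to 515 (516 lines per frame), and hsync held low for counts 0 to 95 (96 clocks):

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
f_{\text{line}}  &= \frac{f_{\text{pixel}}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{96}{f_{\text{pixel}}} = 3.84\ \mu\text{s}
\end{align*}
```

For comparison, the commonly published 640 x 480 at 60 Hz timing uses 800 clocks per line and 525 lines with a 25.175 MHz pixel clock, so the values in this listing are close to that mode but not identical to it.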

                                                                  PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of the visible frame): restart the address.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the image: show the memory bit on all three colors.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Right of the image: green fill.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Last enabled line: white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last image location.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: drive black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;
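
The address wrap-around constant in the pixel generator can be checked by hand. The arithmetic below is certain; reading it as the size of a 512 x 480 one-bit image is an interpretation of the constants (reads occur while c_hs is at most 655 inside the horizontal enable window, which opens near count 144, and while line_count runs from 35 to 514), not something the listing states.

```latex
\begin{align*}
\texttt{111011111111111111}_2 &= 2^{18} - 2^{14} - 1 = 245\,759 \\
(655 - 144 + 1) \times (514 - 35 + 1) &= 512 \times 480 = 245\,760 = 245\,759 + 1
\end{align*}
```

Under that reading, the 18-bit address counts locations 0 to 245,759, one per displayed image pixel, and the frame-start reset on the rising edge of v_en keeps it aligned even if a few reads are lost at the window edges.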

                                                                  12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
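
As a rough illustration of how one logic block can act as either an AND or an XOR gate, the sketch below models a 3-input lookup table in VHDL. The entity name lut3 and the generic INIT are hypothetical names chosen for this example; they are not taken from any vendor library or from the listings in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 3-input lookup table (illustrative names only).
-- INIT stores one output bit per input combination: bit i is the
-- output when the inputs, read as an unsigned number, equal i.
entity lut3 is
  generic (
    INIT : std_logic_vector(7 downto 0) := "10010110"  -- 3-input XOR
  );
  port (
    a : in  std_logic_vector(2 downto 0);
    y : out std_logic
  );
end lut3;

architecture rtl of lut3 is
begin
  -- The inputs simply index the stored truth table.
  y <= INIT(to_integer(unsigned(a)));
end rtl;
```

Changing only the stored bits changes the function: with INIT set to "10000000" the same structure behaves as a 3-input AND gate. This is the sense in which the logic is configured rather than rewired.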

History

The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option
of being programmed in batches in a factory or in the field
(field-programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant part of its market
share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic (see Soft
processors below).

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to suit
the task at hand. The Mitrion Virtual Processor from Mitrionics is an example
of a reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at runtime, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

                                                                  122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs, but
manufacturing their final version so it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have been historically used for,
which the company credits for the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result of this is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. The FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.
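The sum-of-products structure mentioned above can be illustrated with a small
behavioural sketch. This is only a model for explanation, not any vendor's
CPLD architecture; the function name and the dictionary encoding of product
terms are assumptions made for the example.

```python
def sum_of_products(product_terms, inputs):
    """Evaluate an OR of AND terms, as in a programmable logic array.

    Each product term is a dict mapping an input name to the value it must
    have: True selects the input itself, False selects its complement.
    (Hypothetical encoding chosen for this sketch.)
    """
    return any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in product_terms
    )


# XOR of a and b written as two product terms: (a AND NOT b) OR (NOT a AND b)
xor_terms = [{"a": True, "b": False}, {"a": False, "b": True}]
print(sum_of_products(xor_terms, {"a": True, "b": False}))
```

In a CPLD the result of such an array would typically feed one of the clocked
registers; the register is left out here to keep the sketch short.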

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

                                                                  123 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
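The low-volume argument can be made concrete with a rough break-even
calculation. The sketch below assumes a deliberately simple cost model (ASIC
total = one-off engineering cost plus unit cost times volume, FPGA total =
unit cost times volume, with no FPGA development cost); the function name and
all figures are hypothetical and only illustrate the reasoning.

```python
import math


def break_even_units(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Smallest volume at which the ASIC's total cost is no higher than the FPGA's.

    Assumed model: ASIC total = asic_nre + n * asic_unit_cost,
                   FPGA total = n * fpga_unit_cost.
    Returns None when the FPGA carries no per-unit premium.
    """
    premium = fpga_unit_cost - asic_unit_cost
    if premium <= 0:
        return None  # the ASIC never recovers its one-off cost
    return math.ceil(asic_nre / premium)


# Hypothetical figures: 1,000,000 one-off ASIC cost, 5 vs 25 per unit
print(break_even_units(asic_nre=1_000_000, asic_unit_cost=5, fpga_unit_cost=25))
```

Below the returned volume the FPGA is the cheaper option under this model,
which is why small production runs have favoured programmable parts.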

                                                                  Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
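Rent's rule is commonly written as T = t * G^p, where T is the number of
external connections of a region, G is the number of logic blocks in it, t is
the average number of connections per block and p is the Rent exponent. The
sketch below only evaluates that formula; the default values of t and p are
illustrative assumptions, not figures from any FPGA vendor.

```python
def rent_terminals(num_blocks, t=4.0, p=0.6):
    """Rent's rule estimate T = t * G**p for a region of num_blocks blocks.

    t and p are placeholder values chosen for illustration only.
    """
    return t * num_blocks ** p


# A design with a higher Rent exponent needs more wiring for the same logic,
# which matches the crossbar versus systolic array remark in the text.
for exponent in (0.5, 0.8):
    print(exponent, rent_terminals(1024, t=4.0, p=exponent))
```

A manufacturer could use such an estimate, together with measurements on
existing designs, as one input when choosing the channel width.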

The FPGA considered here is an array, or island-style, FPGA. It consists of
an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

                                                                  121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                  Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
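The behaviour of such a logic block can be sketched in a few lines. The class
below is a behavioural model only, assuming a 16-entry truth table, a
flip-flop that starts at 0, and a configuration bit that selects the
registered or unregistered output; all names are hypothetical.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.truth_table = list(truth_table)
        self.registered = registered  # selects which value drives the output
        self.ff = 0  # flip-flop state, assumed to start at 0

    def lut(self, a, b, c, d):
        """Look up the configured output for the four input bits."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Value on the single output pin."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND: only index 15 (all inputs 1) gives 1
and4 = [0] * 15 + [1]
comb = LogicBlock(and4)
reg = LogicBlock(and4, registered=True)
```

With this configuration, `comb.output(1, 1, 1, 1)` follows the inputs
immediately, while `reg` only changes its output after `reg.clock(...)` has
been called.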

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                  Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number one connects only to wires in
track number one in adjacent channel segments, wires in track number 2
connect only to other wires in track number 2, and so on. The figure below
illustrates the connections in a switch box.

                                                                  Switch box topology
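The planar rule described above, where a wire can only reach the same track
number on the other three sides, can be written down directly. The side names
and the function below are assumptions made for this sketch; tracks are
numbered from 1 to match the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Wires reachable from (side, track) through a planar switch box.

    Each entering wire has three programmable switches, one to the wire with
    the same track number on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError("unknown side: %r" % (side,))
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


# A wire entering from the west on track 2 of a 4-track channel
print(planar_connections("west", 2, channel_width=4))
```

Because the track number never changes, a signal routed through several
planar switch boxes stays in the same track domain along its whole path.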

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

                                                                  124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures because it's possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device like
an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar
license), and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
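The idea of repeating the same test bench against the RTL and gate-level
descriptions can be sketched without any HDL tooling. The example below is an
analogy in Python, not a real simulator flow: a full adder is described once
behaviourally and once as gates, and a small exhaustive test bench checks
that both agree. All function names are hypothetical, and timing
(propagation delay) is not modelled.

```python
from itertools import product


def rtl_full_adder(a, b, cin):
    """Behavioural (RTL-style) description: add the three input bits."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def gate_full_adder(a, b, cin):
    """Gate-level description of the same function, as a netlist might express it."""
    axb = a ^ b
    s = axb ^ cin
    cout = (a & b) | (axb & cin)
    return s, cout


def equivalent(model_a, model_b, num_inputs):
    """Test bench: apply every input combination and compare both models."""
    return all(
        model_a(*bits) == model_b(*bits)
        for bits in product((0, 1), repeat=num_inputs)
    )


print(equivalent(rtl_full_adder, gate_full_adder, 3))
```

Exhaustive comparison is only practical for very small blocks; for real
designs the test bench applies selected stimulus and the same stimulus is
reused at each stage of the flow.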

                                                                  Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                  Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused
CSSP, no general purpose FPGAs).

                                                                  125 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design
has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become
a good method to do this.

                                                                  Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is compared to designers relying only on
software simulations to verify that their hardware design is sound.
Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows for
much more time in the area of software development and testing at the
software-hardware integration stage. This allows many unforeseen software
bugs, which appear due to today's array of operating systems, applications
and hardware, to be found earlier.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement to the chip features
than would otherwise have been possible.


                                                                  13 SIMULATION RESULTS

[Figure: VGA controller, gate level]

[Figure: sync unit, gate level]

[Figure: technology schematic]

                                                                  14 APPLICATIONS

                                                                  Motion Tracking Systems

                                                                  Image processing systems

                                                                  Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




The first publicly available version was released in 1985.

In 1986, the IEEE (Institute of Electrical and Electronics Engineers, Inc.) was presented with a proposal to standardize VHDL.

In 1987, standardization => IEEE 1076-1987.

An improved version of the language was released in 1994 => IEEE standard 1076-1993.

10.2 Related Standards

IEEE 1076 doesn't support simulation conditions such as unknown and high-impedance.

Soon after IEEE 1076-1987 was released, simulator companies began using their own non-standard types => VHDL was becoming non-standard.

The IEEE 1164 standard was developed by the IEEE. IEEE 1164 contains definitions for a nine-valued data type, std_logic.

IEEE 1076.3 (Numeric or Synthesis Standard) defines data types as they relate to actual hardware. It defines, e.g., two numeric types: signed and unsigned.
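As a minimal sketch of how the numeric types are typically used, the following adder treats std_logic_vector inputs as unsigned numbers from the standard numeric_std package. The entity name add4 and its port names are hypothetical and chosen only for illustration; they are not part of the VGA controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;  -- std_logic, std_logic_vector (IEEE 1164)
use ieee.numeric_std.all;     -- signed, unsigned (IEEE 1076.3)

-- Hypothetical 4-bit adder, used only to show the numeric types.
entity add4 is
  port (
    a, b : in  std_logic_vector(3 downto 0);
    sum  : out std_logic_vector(4 downto 0)
  );
end entity add4;

architecture rtl of add4 is
begin
  -- Interpret the raw bit vectors as unsigned numbers, widen them to
  -- 5 bits so the carry is kept, add, and convert back to a plain vector.
  sum <= std_logic_vector(resize(unsigned(a), 5) + resize(unsigned(b), 5));
end architecture rtl;
```

The explicit conversions make it clear at each step whether a vector is being treated as a number or as a bundle of bits.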

VHDL Environment

Design Units

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.
Defines the inputs/outputs of a component (defines pins).
A way to represent modularity in VHDL.
Similar to a symbol in a schematic.
The entity declaration describes the entity.

E.g.:

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.
Each port must have a name, a direction and a type.
An entity may have NO port declaration.

Port directions

In: The value of a port can be read inside the component but cannot be assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to a port, but data from the port cannot be read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment (not recommended).
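One common way to live with these rules without resorting to buffer ports is to keep an internal signal that can be read freely and to drive the out port from it. The sketch below assumes a hypothetical toggle flip-flop (the names toggle, q and q_int are illustrative only). Note that the restriction on reading out ports applies to VHDL standards before VHDL-2008.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical toggle flip-flop illustrating port modes.
entity toggle is
  port (
    clk : in  std_logic;  -- mode in: may be read, never assigned
    q   : out std_logic   -- mode out: may be assigned, not read (pre-2008)
  );
end entity toggle;

architecture rtl of toggle is
  -- Internal copy of the output; the initial value is for simulation.
  signal q_int : std_logic := '0';
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q_int <= not q_int;  -- reading q_int is legal; reading q would not be
    end if;
  end process;

  q <= q_int;  -- the out port is only ever assigned
end architecture rtl;
```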

Architectures

Every entity has at least one architecture.
One entity can have several architectures.
Architectures can describe a design using:
– Behavior
– Structure
– Dataflow
Architectures can describe a design on many levels:
– Gate level
– RTL (Register Transfer Level)
– Behavioral level
A configuration declaration links an architecture to an entity.

E.g.:

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and the architecture body together.
The concept of a default configuration is a bit messy in VHDL '87:
– The last architecture analyzed links to the entity.
Can be used to change simulation behavior without re-analyzing the VHDL source.
Complex configuration declarations are ignored in synthesis.
Some entities can have, e.g., a gate-level architecture and a behavioral architecture.
Are always optional.
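To illustrate how a configuration picks one of several architectures, here is a small self-contained sketch built around the Comparator entity. The second architecture (Comparator2) and the configuration name Comparator_cfg are hypothetical additions made only for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end entity Comparator;

-- Dataflow-style architecture.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Hypothetical behavioral alternative with the same function.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: explicitly binds the entity to Comparator2,
-- instead of relying on whichever architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg rather than the bare entity makes the choice of architecture explicit and repeatable.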

Packages

Packages contain information common to many design units.

1. Package declaration
– constant declarations
– type and subtype declarations
– function and procedure declarations
– global signal declarations
– file declarations
– component declarations

2. Package body
– is not necessarily needed
– function bodies
– procedure bodies

Packages are meant for encapsulating data which can be shared globally among several design units.
A package consists of a declaration part and an optional body part.

A package declaration can contain:
– type and subtype declarations
– subprograms
– constants
– alias declarations
– global signal declarations
– file declarations
– component declarations

A package body consists of:
– subprogram declarations and bodies
– type and subtype declarations
– deferred constants
– file declarations
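A minimal sketch of a package with a declaration part and a body might look as follows. The package name vga_types and everything declared in it are assumptions made for illustration; only the 640x480 resolution comes from the report itself.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package vga_types is
  constant H_VISIBLE : natural := 640;  -- visible pixels per line
  constant V_VISIBLE : natural := 480;  -- visible lines per frame

  subtype pixel_t is std_logic_vector(2 downto 0);  -- one bit each for R, G, B

  constant BLACK : pixel_t;  -- deferred constant, value given in the body

  function is_visible(x, y : natural) return boolean;
end package vga_types;

-- Package body: needed here because of the deferred constant and the function.
package body vga_types is
  constant BLACK : pixel_t := "000";

  function is_visible(x, y : natural) return boolean is
  begin
    return x < H_VISIBLE and y < V_VISIBLE;
  end function is_visible;
end package body vga_types;
```

A design unit compiled into the same library would then make these names visible with `use work.vga_types.all;`.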

Libraries

A collection of VHDL design units (a database):
1. Packages
   – package declaration
   – package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)
Usually a directory in a UNIX file system.
Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.
It is possible to describe a circuit in a number of ways, listed here in order of increasing level of abstraction:
Structural
Dataflow
Behavioral

Structural VHDL description

The circuit is described in terms of its components.
From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).
For large circuits, a low-level description quickly becomes impractical.

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.
In the dataflow style you describe how information flows between registers in the system.
The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.
The behavior of the system over time is defined by registers.
There are no built-in registers in the VHDL language:
– either a lower-level description
– or a behavioral description of sequential elements is needed.
The lower-level register descriptions must be created or obtained.
If there are no third-party models for registers => you must write the behavioral description of the registers.
The behavioral description can be provided in the form of subprograms (functions or procedures).
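Since VHDL has no built-in register, a behavioral description has to be written. The sketch below shows one conventional way to do it with a clocked process rather than a subprogram. The entity name reg8 and the active-low asynchronous reset are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register with active-low asynchronous reset.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');   -- reset takes priority over the clock
    elsif rising_edge(clk) then
      q <= d;                 -- capture the input on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognize this pattern and map it to flip-flops, so the same description serves for both simulation and hardware.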

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.
Representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.
The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
If no actual delay is used, the order of sequential operations is defined.
In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
There are a few tools for behavioral synthesis.
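The delay notation can be tried out in simulation with a small self-checking sketch such as the one below. The entity delay_demo and its signal names are hypothetical; the 10 ns delay mirrors the example in the text and would be ignored by a synthesis tool.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only wrapper, so the entity has no ports.
entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- a follows b with a 10 ns delay.
  a <= b after 10 ns;

  stimulus : process
  begin
    b <= '1';
    wait for 5 ns;
    -- Only 5 ns have passed, so a must still hold its old value.
    assert a = '0' report "a changed too early" severity error;
    wait for 10 ns;
    -- 15 ns after the change on b, a must have followed.
    assert a = '1' report "a did not follow b" severity error;
    wait;  -- suspend forever
  end process stimulus;
end architecture sim;
```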

Concurrent vs. Sequential

Processes

The basic simulation concept in VHDL.
A VHDL description can always be broken up into interconnected processes.
Quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.
Statements inside process statements are sequential statements.
A process must contain either a sensitivity list or wait statement(s), but NOT both.
The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
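The two forms of a process can be compared side by side in a short simulation sketch. All names here (proc_demo, and_sens, and_wait, stimulus) are hypothetical. The first process uses a sensitivity list, the second an equivalent wait statement at the end, and a third process applies inputs and checks that both give the same result.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only wrapper: an entity with no port declaration.
entity proc_demo is
end entity proc_demo;

architecture sim of proc_demo is
  signal a, b   : std_logic := '0';
  signal y1, y2 : std_logic;
begin
  -- Form 1: woken up by any event on the signals in the sensitivity list.
  and_sens : process (a, b)
  begin
    y1 <= a and b;
  end process and_sens;

  -- Form 2: no sensitivity list; the wait statement names the signals.
  and_wait : process
  begin
    y2 <= a and b;
    wait on a, b;
  end process and_wait;

  -- Statements inside a process run sequentially, in the order written.
  stimulus : process
  begin
    a <= '1';
    b <= '1';
    wait for 10 ns;
    assert y1 = '1' and y2 = '1' report "AND mismatch" severity error;
    b <= '0';
    wait for 10 ns;
    assert y1 = '0' and y2 = '0' report "AND mismatch" severity error;
    wait;  -- suspend forever
  end process stimulus;
end architecture sim;
```

The three processes execute concurrently with respect to each other, while the statements inside each one execute in sequence.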


                                                                    11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal synch pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;


                                                                    SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s       : std_logic;
    signal line_count_s, c_hs_s : integer;
begin
    -- Drive the output ports from the internal copies
    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts pixel clocks from 0 to 793 within each line
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter
    -- vertical line counter: advances on each rising edge of h_en_s and wraps after 515
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: driven low while the horizontal counter reads 0 to 95
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information is sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if (line_count_s >= 35 and line_count_s <= 515) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical sync pulse: driven low for line counts 0 and 1
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- One-clock delayed copy of v_en_s, so a rising edge of v_en can be detected downstream
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
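The counter limits in the listing above fix the line and frame timing. The following worked check counts the clocks and lines implied by each comparison. The pixel clock frequency is not stated in this listing, so the value of 25 MHz used for the two rates is only an assumption for illustration.

```latex
\begin{align*}
N_{\text{clocks per line}} &= 793 + 1 = 794 \\
N_{\text{hsync low}}       &= 95 + 1 = 96 \ \text{clocks} \\
N_{\text{h\_en high}}      &= 778 - 143 + 1 = 636 \ \text{clocks} \\
N_{\text{lines per frame}} &= 515 + 1 = 516 \\
N_{\text{vsync low}}       &= 2 \ \text{lines} \\
N_{\text{v\_en high}}      &= 515 - 35 + 1 = 481 \ \text{lines} \\
f_{\text{line}}  &= \frac{f_{\text{clk}}}{794} \approx 31.49\ \text{kHz}
                    \quad (\text{assuming } f_{\text{clk}} = 25\ \text{MHz}) \\
f_{\text{frame}} &= \frac{f_{\text{clk}}}{794 \times 516} = \frac{f_{\text{clk}}}{409\,704} \approx 61.0\ \text{Hz}
                    \quad (\text{assuming } f_{\text{clk}} = 25\ \text{MHz})
\end{align*}
```

Under that assumed clock, the frame rate comes out close to the 60 Hz refresh that a VGA monitor expects; a different clock scales both rates proportionally.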


                                                                    PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

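architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- A rising edge of v_en marks the start of a frame: restart the read address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Inside the image: read the next pixel and show it on all three channels
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Right of the image: fill with green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Below the image: fill with white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address back to zero once it reaches 245759
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Outside the enabled region: drive black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;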
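The comparison limits in the pixel generator can be checked against the address wrap value. This is a nominal count that ignores the one-clock register delays at the edges of the frame. It relies on the synchronization block, where h_en is a registered copy of the test 143 <= c_hs_s <= 778, so the pixel generator sees h_en high while c_hs reads 144 to 779.

```latex
\begin{align*}
\text{pixels per line}  &= 655 - 144 + 1 = 512 \\
\text{lines per frame}  &= 514 - 35 + 1 = 480 \\
\text{pixels per frame} &= 512 \times 480 = 245\,760 \\
\texttt{111011111111111111}_2 &= 2^{18} - 1 - 2^{14} = 245\,759 = 245\,760 - 1
\end{align*}
```

The wrap value is therefore the last address of a 512 x 480 image stored at one bit per pixel, which matches the 18-bit address bus and the 1-bit data_in port.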


                                                                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
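To illustrate how one configurable element can stand in for different gates, here is a minimal sketch of a 3-input lookup table. The entity name lut3, the generic INIT, and the port names are hypothetical and do not come from any vendor library; unlike the project listings, the sketch uses the IEEE numeric_std package for the index conversion.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input lookup table (illustration only).
-- INIT holds the 8-entry truth table; the inputs select which bit drives y.
entity lut3 is
    generic (
        INIT : std_logic_vector(7 downto 0) := "10010110"  -- 3-input XOR
    );
    port (
        sel : in  std_logic_vector(2 downto 0);
        y   : out std_logic
    );
end lut3;

architecture rtl of lut3 is
begin
    -- Index the truth table with the inputs read as an unsigned number
    y <= INIT(to_integer(unsigned(sel)));
end rtl;
```

Changing only the generic changes the function: INIT = "10000000" would give a 3-input AND, since the output is then 1 only when all three inputs are 1. Configuring a real FPGA logic block amounts to loading such table contents, although the details are device specific.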

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 had a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was inducted
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

    1987:         9,000 gates (Xilinx)
    1992:         600,000 (Naval Surface Warfare Department)
    Early 2000s:  millions

Market size

    1985:            first commercial FPGA technology invented by Xilinx
    1987:            $14 million
    ~1993:           >$385 million
    2005:            $1.9 billion
    2010 estimates:  $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

    - IC costs are rising aggressively
    - ASIC complexity has increased development time and costs
    - R&D resources and headcount are decreasing
    - Revenue losses for slow time-to-market are increasing
    - Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company credits them with the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs began
to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in per-unit hardware cost for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

                                                                    Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit into
the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The example FPGA is an array, or island-style, FPGA. It consists of an array
of logic blocks and routing channels. Two I/O pads fit into the height of one
row or the width of one column, as shown below. All the routing channels have
the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                    Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
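
The logic block described above can be sketched in VHDL as follows. This is
only a behavioural model under stated assumptions, not vendor code: the entity
name LogicBlock, the generics INIT and USE_REG, and the port names are all
hypothetical and chosen for illustration. The LUT is modelled as a 16-entry
truth table indexed by the four inputs, and the choice between registered and
unregistered output is modelled as a generic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: a 4-input LUT followed by a
-- D flip-flop, with a single output that is either registered or not.
entity LogicBlock is
  generic (
    -- Truth table of the LUT (assumed default: 4-input AND, only bit 15 set).
    INIT    : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the flip-flop output, false the raw LUT output.
    USE_REG : boolean := true
  );
  port (
    CLK : in  std_logic;
    I   : in  std_logic_vector(3 downto 0);
    O   : out std_logic
  );
end LogicBlock;

architecture Behavioral of LogicBlock is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16x1 memory addressed by the four inputs.
  lut_out <= INIT(to_integer(unsigned(I)));

  -- The flip-flop samples the LUT output on each rising clock edge.
  process (CLK)
  begin
    if rising_edge(CLK) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection, fixed at configuration time in a real device.
  O <= ff_out when USE_REG else lut_out;
end Behavioral;
```

Changing INIT changes the logic function without changing the structure,
which is the sense in which the block is configurable.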

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                    Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right of and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use longer
routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track number
1 in adjacent channel segments, wires in track number 2 connect only to other
wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                    Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's proprietary
software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, produces a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device such
as an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as OpenCores
(typically free, and released under the GPL, BSD, or a similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.
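
The first step of that flow, simulating the RTL description with a test bench,
can be sketched as follows. The design under test is an 8-bit equality
comparator like the one used as an example in the VHDL part of this report.
The test bench entity Comparator_tb, its architecture Test, the stimulus
values, and the 10 ns delays are assumptions made for illustration only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test: 8-bit equality comparator.
entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical test bench: an entity with no ports that drives the
-- inputs of the design under test and checks its output.
entity Comparator_tb is
end Comparator_tb;

architecture Test of Comparator_tb is
  signal A, B : std_logic_vector(7 downto 0) := (others => '0');
  signal EQ   : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => A, B => B, EQ => EQ);

  stimulus : process
  begin
    -- Equal operands: EQ is expected to be '1'.
    A <= x"3C"; B <= x"3C";
    wait for 10 ns;
    assert EQ = '1' report "equal inputs should give EQ = '1'" severity error;

    -- Operands differing in one bit: EQ is expected to be '0'.
    A <= x"3C"; B <= x"3D";
    wait for 10 ns;
    assert EQ = '0' report "different inputs should give EQ = '0'" severity error;

    report "test bench finished";
    wait;  -- suspend the process forever so the simulation can end
  end process;
end Test;
```

The same test bench can in principle be reused for the gate-level and
back-annotated simulations by binding the dut instance to the generated
netlist instead of the RTL architecture.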

                                                                    Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages.
Some, but not all, flash devices can be in-system programmed. Usually, a flash
cell is smaller than an equivalent EEPROM cell and is therefore less expensive
to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                    Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a
good method for doing this.

                                                                    Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. The alternative is for designers to rely only on
software simulations to verify that their hardware design is sound, where
simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This makes it possible to catch many unforeseen software
bugs that appear due to today's array of operating systems, applications, and
hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features than
would otherwise have been possible.

                                                                    13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                    14 APPLICATIONS

                                                                    Motion Tracking Systems

                                                                    Image processing systems

                                                                    Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.

                                                                    REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. A VGA display controller, by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"

Segments of VHDL code that can be compiled separately and stored in a library.

Entities

A black box with an interface definition.

Defines the inputs/outputs of a component (defines the pins).

A way to represent modularity in VHDL.

Similar to a symbol in a schematic.

An entity declaration describes the entity.

E.g.:

library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

Ports

Provide channels of communication between the component and its environment.

Each port must have a name, a direction, and a type.

An entity may have no port declaration.

Port directions

In: The value of the port can be read inside the component but cannot be
assigned. Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be
read. Multiple assignments are allowed.

Inout: Bi-directional; assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(not recommended).
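
The port directions can be illustrated with a small sketch. Everything in it
is hypothetical (the entity PortModesDemo, its ports, and the 4-bit counter it
implements); it is only meant to show one common way of working with the
rules above: an internal signal holds the value that the design needs to read
back, so the output can stay a plain out port rather than a buffer port.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 4-bit counter used only to illustrate port directions.
entity PortModesDemo is
  port (
    CLK   : in    std_logic;                     -- in: read, never assigned
    RESET : in    std_logic;
    DRIVE : in    std_logic;
    COUNT : out   std_logic_vector(3 downto 0);  -- out: assigned, never read
    DATA  : inout std_logic                      -- inout: may be driven or read
  );
end PortModesDemo;

architecture RTL of PortModesDemo is
  -- Internal copy of the count, so the design can read its own state
  -- without declaring COUNT as a buffer port.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if RESET = '1' then
        count_i <= (others => '0');
      else
        count_i <= count_i + 1;
      end if;
    end if;
  end process;

  COUNT <= std_logic_vector(count_i);

  -- Drive the bi-directional pin only while DRIVE is high; otherwise
  -- release it ('Z') so that an external source can drive it.
  DATA <= count_i(0) when DRIVE = '1' else 'Z';
end RTL;
```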

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using:
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels:
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations: link an entity declaration and an architecture body together.
- The concept of a default configuration is a bit messy in VHDL '87: the last architecture analyzed is linked to the entity.
- A configuration can be used to change simulation behavior without re-analyzing the VHDL source.
- Complex configuration declarations are ignored in synthesis.
- Some entities can have, e.g., both a gate-level architecture and a behavioral architecture.
- Configuration declarations are always optional.
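As a minimal sketch of how a configuration selects between architectures, the Comparator entity is repeated here with two architectures. Comparator2 and the configuration name Comparator_cfg are hypothetical additions for illustration and are not part of the original design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (A, B : in  std_logic_vector(7 downto 0);
        EQ   : out std_logic);
end Comparator;

-- Dataflow-style architecture, as in the text.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

-- Hypothetical second architecture with the same behavior,
-- written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end Comparator2;

-- Hypothetical configuration: binds the entity to Comparator2
-- regardless of which architecture was analyzed last.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end Comparator_cfg;
```

Simulating Comparator_cfg rather than the entity itself makes the choice of architecture explicit.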

Packages: contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not always needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain:
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of:
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries: a library is a collection of VHDL design units (a database).

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

A library is usually a directory in the UNIX file system, but it can also be any other kind of database.
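The package and library ideas above can be sketched together. The package vga_demo_pkg, its contents, and the entity blank_demo are hypothetical names chosen for illustration. The sketch assumes all units are analyzed into the default library work.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Package declaration: the part visible to other design units.
package vga_demo_pkg is
  constant H_VISIBLE : integer := 640;
  subtype pixel_t is std_logic_vector(0 downto 0);
  constant BLANK : pixel_t;  -- deferred constant, value given in the body
  function is_visible (col : integer) return boolean;
end vga_demo_pkg;

-- Package body: holds the subprogram bodies and deferred constant values.
package body vga_demo_pkg is
  constant BLANK : pixel_t := "0";

  function is_visible (col : integer) return boolean is
  begin
    return col >= 0 and col < H_VISIBLE;
  end is_visible;
end vga_demo_pkg;

-- A design unit names the library and package it needs in its
-- context clause; the library work is always implicitly available.
library ieee;
use ieee.std_logic_1164.all;
use work.vga_demo_pkg.all;

entity blank_demo is
  port (col     : in  integer;
        pix_in  : in  pixel_t;
        pix_out : out pixel_t);
end blank_demo;

architecture rtl of blank_demo is
begin
  -- Pass the pixel through only inside the visible columns.
  pix_out <= pix_in when is_visible(col) else BLANK;
end rtl;
```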

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:
- Structural
- Dataflow
- Behavioral

Structural VHDL Description

- The circuit is described in terms of its components.
- The description can range from low level (e.g. a transistor-level description) to high level (e.g. a block diagram).
- For large circuits, a low-level description quickly becomes impractical.
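A structural description can be sketched as a 2-bit equality comparator wired from gate components. The entities xnor2, and2, and eq2 are hypothetical and only illustrate the style.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity xnor2 is
  port (a, b : in std_logic; y : out std_logic);
end xnor2;

architecture rtl of xnor2 is
begin
  y <= a xnor b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity and2 is
  port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
  y <= a and b;
end rtl;

library ieee;
use ieee.std_logic_1164.all;

entity eq2 is
  port (a, b : in  std_logic_vector(1 downto 0);
        eq   : out std_logic);
end eq2;

-- Structural architecture: only component instances and the wires
-- between them, with no description of behavior at this level.
architecture structural of eq2 is
  component xnor2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  component and2
    port (a, b : in std_logic; y : out std_logic);
  end component;
  signal e0, e1 : std_logic;  -- per-bit equality
begin
  u0 : xnor2 port map (a => a(0), b => b(0), y => e0);
  u1 : xnor2 port map (a => a(1), b => b(1), y => e1);
  u2 : and2  port map (a => e0,   b => e1,   y => eq);
end structural;
```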

10.3 Dataflow VHDL Description

- The circuit is described in terms of how data moves through the system.
- In the dataflow style you describe how information flows between registers in the system.
- The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely.
- The behavior of the system over time is defined by the registers.
- There are no built-in registers in the VHDL language, so either a lower-level description or a behavioral description of the sequential elements is needed.
- The lower-level register descriptions must be created or obtained. If there are no third-party models for registers, you must write the behavioral description of the registers yourself.
- The behavioral description can be provided in the form of subprograms (functions or procedures).
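Because VHDL has no built-in register, a behavioral description of one is usually written as a clocked process. The following is a minimal sketch, assuming an 8-bit register with an active-low asynchronous reset. The entity name reg8 and its ports are hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity reg8 is
  port (clk  : in  std_logic;
        rstn : in  std_logic;   -- active-low asynchronous reset (assumed)
        d    : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0));
end reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');     -- clear the register on reset
    elsif rising_edge(clk) then
      q <= d;                   -- capture d on each rising clock edge
    end if;
  end process;
end behavioral;
```

The VGA listings later in this document use the same pattern, written with clk = '1' and clk'event in place of rising_edge.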

10.4 Behavioral VHDL Description

- The circuit is described in terms of its operation over time.
- The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.
- The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).
- If no actual delay is used, only the order of sequential operations is defined.
- At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.
- The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.
- There are a few tools for behavioral synthesis.
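The delay example A <= B after 10 ns can be tried in a small simulation-only sketch. The entity delay_demo and the stimulus times are assumptions made for illustration. The assertions check that a follows b exactly 10 ns later.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end delay_demo;

architecture sim of delay_demo is
  signal a, b : std_logic := '0';
begin
  -- Concurrent assignment with an explicit delay (simulation only;
  -- a synthesis tool would ignore the "after" clause).
  a <= b after 10 ns;

  stimulus : process
  begin
    wait for 5 ns;
    b <= '1';                 -- b changes at t = 5 ns
    wait for 9 ns;            -- t = 14 ns: less than 10 ns after the change
    assert a = '0' report "a changed too early" severity error;
    wait for 2 ns;            -- t = 16 ns: more than 10 ns after the change
    assert a = '1' report "a did not follow b" severity error;
    wait;                     -- suspend forever
  end process;
end sim;
```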

Concurrent vs. Sequential

Processes

- The process is the basic simulation concept in VHDL.
- A VHDL description can always be broken up into interconnected processes.
- A process is quite similar to a Unix process.

Process keyword in VHDL

- A process statement is a concurrent statement.
- Statements inside a process statement are sequential statements.
- A process must contain either a sensitivity list or wait statement(s), but NOT both.
- The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
  process_declarative_part
begin
  process_statements
  [wait_statement]
end process;
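The two forms of the general format can be compared side by side. In this sketch (the entity and_demo and its ports are hypothetical) both processes describe the same AND gate: a sensitivity list behaves like a wait on statement placed at the end of the process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity and_demo is
  port (a, b   : in  std_logic;
        y1, y2 : out std_logic);
end and_demo;

architecture behavioral of and_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y1 <= a and b;
  end process;

  -- Form 2: no sensitivity list, explicit wait statement.
  with_wait : process
  begin
    y2 <= a and b;
    wait on a, b;   -- suspend until a or b changes
  end process;
end behavioral;
```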


11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic);
  end component;

  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode out, matching the sync_unit entity
      c_hs       : out integer);   -- mode out, matching the sync_unit entity
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Clock divider: toggles clk on every rising edge of clk50m,
  -- so clk runs at half the frequency of clk50m.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;
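The clock divider in the top-level architecture can be checked in isolation. The following test bench is a sketch, not part of the original project: the entity clkdiv_tb, the reset release time, and the checking process are assumptions. It copies the divider process and asserts that a 50 MHz input (20 ns period) yields a 40 ns output period, i.e. 25 MHz.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity clkdiv_tb is
end clkdiv_tb;

architecture sim of clkdiv_tb is
  signal clk50m : std_logic := '0';
  signal rstn   : std_logic := '0';
  signal clk    : std_logic;
begin
  clk50m <= not clk50m after 10 ns;  -- 20 ns period = 50 MHz
  rstn   <= '1' after 25 ns;         -- release reset (time chosen arbitrarily)

  -- Same divide-by-two logic as in vga_controller.
  divider : process (clk50m, rstn)
  begin
    if rstn = '0' then
      clk <= '0';
    elsif rising_edge(clk50m) then
      clk <= not clk;
    end if;
  end process;

  -- Measure the time between two rising edges of the divided clock.
  check : process
    variable t0 : time;
  begin
    wait until rising_edge(clk);
    t0 := now;
    wait until rising_edge(clk);
    assert now - t0 = 40 ns
      report "unexpected divided clock period" severity error;
    report "divided clock period: " & time'image(now - t0);
    wait;
  end process;
end sim;
```

Since clk50m toggles forever, the simulation should be run for a bounded time (a few hundred nanoseconds is enough).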


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts clk cycles within a line, 0 to 793
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      -- the later assignment wins, so the counter wraps after 515
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: low while the horizontal counter is 0 to 95
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

--- vsync pulse
process(clk, rstn) -- Vertical Enable (Data is Valid in this Region)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        -- Lines 35..515 form the visible region
        if ((line_count_s >= 35 and line_count_s <= 515)) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
    --end if;
end process;

process(clk, rstn) -- Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        -- vsync is active low for the first two lines of the frame
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

process(clk)
begin
    -- One-cycle delayed copy of v_en_s, used to detect its rising edge
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;
end arch_sync_unit;

                                                                      PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en: restart the memory address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Visible pixel: read memory and drive all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address at the end of the frame buffer
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Blanking interval: drive black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;
end behv;

                                                                      12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option
of being programmed in batches in a factory or in the field
(field-programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, which the company credits for the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
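The trade-off between number format and parallelism can be illustrated with a small sketch. The entity below is hypothetical (the name dot4, the port names, and the 8-bit operand width are assumptions, not part of the original design). It uses narrow 8-bit operands so that four multipliers can work side by side and produce one four-term dot product per clock cycle. It relies on ieee.numeric_std rather than the STD_LOGIC_ARITH package used in the listings above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: four 8-bit multiplications performed in parallel
-- and summed in a single clock cycle.
entity dot4 is
    port (
        clk            : in  std_logic;
        a0, a1, a2, a3 : in  signed(7 downto 0);
        b0, b1, b2, b3 : in  signed(7 downto 0);
        result         : out signed(17 downto 0)
    );
end entity dot4;

architecture rtl of dot4 is
begin
    process (clk)
    begin
        if rising_edge(clk) then
            -- Each 8x8 product is 16 bits wide. Sign-extending to 18 bits
            -- leaves enough headroom for the sum of four products.
            result <= resize(a0 * b0, 18) + resize(a1 * b1, 18)
                    + resize(a2 * b2, 18) + resize(a3 * b3, 18);
        end if;
    end process;
end architecture rtl;
```

Widening the operands would increase precision but would also consume more logic per multiplier, leaving room for fewer parallel units on the same device.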

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                                                      Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                      Typical logic block


There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
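The logic block described above can be sketched in VHDL. This is only an illustrative model, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and USE_FF, and the port names are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block:
-- a 4-input LUT followed by an optional flip-flop.
entity logic_block is
  generic (
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000"; -- truth table; x"8000" is a 4-input AND
    USE_FF   : boolean := false                           -- true selects the registered output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    o   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop samples the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  o <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the sense in which a LUT is "programmable".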

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                      Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                      Switch box topology
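As a rough worked count for the planar topology: each track has four wire ends meeting at the switch box (one per side), and each end connects to the three others. Assuming every programmable switch is bidirectional, so that one switch serves both wire ends it joins (an assumption, not stated in the text), the number of switches in a box with channel width W is:

```latex
N_{\text{switch}} = W \cdot \frac{4 \times 3}{2} = 6W
```

For example, under this assumption a channel width of W = 8 would need 48 switches per switch box.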

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
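A minimal sketch of the first step, RTL simulation with a test bench, might look as follows. It reuses the Comparator example from the VHDL part of this document; the 4-bit port width, the test bench name Comparator_tb and the stimulus values are assumptions chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Design under test. The 4-bit width is an assumption.
entity Comparator is
  port (
    A, B : in  std_logic_vector(3 downto 0);
    EQ   : out std_logic
  );
end entity Comparator;

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: it has no ports; it drives the inputs and checks the output.
entity Comparator_tb is
end entity Comparator_tb;

architecture sim of Comparator_tb is
  signal A, B : std_logic_vector(3 downto 0) := (others => '0');
  signal EQ   : std_logic;
begin
  dut : entity work.Comparator(Comparator1)
    port map (A => A, B => B, EQ => EQ);

  stimulus : process
  begin
    A <= "0101"; B <= "0101";
    wait for 10 ns;
    assert EQ = '1' report "equal inputs should give EQ = '1'" severity error;

    A <= "0101"; B <= "0110";
    wait for 10 ns;
    assert EQ = '0' report "different inputs should give EQ = '0'" severity error;

    report "Comparator test bench finished";
    wait;  -- suspend forever so the simulation can end
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the gate level and back-annotated netlists, which is what makes the repeated simulation stages comparable.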

                                                                      Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                      Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design
has become mainstream. Prototyping SoC and ASIC designs on an FPGA has
become a good method to do this.


                                                                      Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. This is compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows much more
time for software development and for testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, away from the simulation stage and in actual form.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement of the chip features than would
otherwise have been possible.


                                                                      13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                      14 APPLICATIONS

                                                                      Motion Tracking Systems

                                                                      Image processing systems

                                                                      Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
picture (pixelated) from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.
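To make the role of the synchronization signals concrete, here is a minimal sketch of a sync unit for the 640x480 mode. It is not the design simulated above: the entity name vga_sync, the port names and the internal counters are assumptions, and the timing constants are the commonly published values for 640x480 at 60 Hz with a pixel clock of about 25.175 MHz.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical sync unit for 640x480 at 60 Hz.
entity vga_sync is
  port (
    pix_clk  : in  std_logic;   -- pixel clock (assumed to be about 25.175 MHz)
    hsync    : out std_logic;   -- active-low horizontal sync
    vsync    : out std_logic;   -- active-low vertical sync
    video_on : out std_logic    -- '1' while inside the visible 640x480 area
  );
end entity vga_sync;

architecture rtl of vga_sync is
  -- Horizontal timing in pixels (the back porch is the remainder of the total).
  constant H_VISIBLE : natural := 640;
  constant H_FRONT   : natural := 16;
  constant H_SYNC    : natural := 96;
  constant H_TOTAL   : natural := 800;
  -- Vertical timing in lines.
  constant V_VISIBLE : natural := 480;
  constant V_FRONT   : natural := 10;
  constant V_SYNC    : natural := 2;
  constant V_TOTAL   : natural := 525;

  signal h_count : natural range 0 to H_TOTAL - 1 := 0;
  signal v_count : natural range 0 to V_TOTAL - 1 := 0;
begin
  -- Pixel counter wraps at the end of each line; line counter wraps at the end of each frame.
  counters : process (pix_clk)
  begin
    if rising_edge(pix_clk) then
      if h_count = H_TOTAL - 1 then
        h_count <= 0;
        if v_count = V_TOTAL - 1 then
          v_count <= 0;
        else
          v_count <= v_count + 1;
        end if;
      else
        h_count <= h_count + 1;
      end if;
    end if;
  end process counters;

  -- Sync pulses are asserted (low) after the visible area and the front porch.
  hsync <= '0' when h_count >= H_VISIBLE + H_FRONT
                and h_count <  H_VISIBLE + H_FRONT + H_SYNC else '1';
  vsync <= '0' when v_count >= V_VISIBLE + V_FRONT
                and v_count <  V_VISIBLE + V_FRONT + V_SYNC else '1';

  video_on <= '1' when h_count < H_VISIBLE and v_count < V_VISIBLE else '0';
end architecture rtl;
```

In such a scheme the RGB values fetched from memory would only be driven while video_on is '1', and forced to black during the blanking intervals.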




Port directions

In: The value of the port can be read inside the component but cannot be assigned.
Multiple reads of the port are allowed.

Out: Assignments can be made to the port, but data from the port cannot be read
(before VHDL-2008). Multiple assignments are allowed.

Inout: Bi-directional. Assignments can be made and data can be read. Multiple
assignments are allowed.

Buffer: An out port with read capability. May have at most one assignment
(not recommended).
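The port modes above can be illustrated with a small counter. This is a sketch with assumed names (port_modes_demo, count_i and the ports); it uses an internal signal instead of a buffer port, which is the usual way to read back a value that also drives an out port.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity port_modes_demo is
  port (
    clk   : in    std_logic;                     -- in: can only be read
    load  : in    std_logic;
    count : out   std_logic_vector(3 downto 0);  -- out: assigned, never read here
    data  : inout std_logic_vector(3 downto 0)   -- inout: both driven and read
  );
end entity port_modes_demo;

architecture rtl of port_modes_demo is
  -- Internal copy of the counter, so the out port never has to be read.
  signal count_i : unsigned(3 downto 0) := (others => '0');
begin
  process (clk)
  begin
    if rising_edge(clk) then
      if load = '1' then
        count_i <= unsigned(data);   -- read the bidirectional port
      else
        count_i <= count_i + 1;
      end if;
    end if;
  end process;

  count <= std_logic_vector(count_i);

  -- Drive the bus only when not loading; release it ('Z') otherwise.
  data <= std_logic_vector(count_i) when load = '0' else (others => 'Z');
end architecture rtl;
```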

Architectures

Every entity has at least one architecture.

One entity can have several architectures.

Architectures can describe a design using
- Behavior
- Structure
- Dataflow

Architectures can describe a design on many levels
- Gate level
- RTL (Register Transfer Level)
- Behavioral level

A configuration declaration links an architecture to an entity.

E.g.

architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end Comparator1;

Configurations

Link the entity declaration and architecture body together.

The concept of default configuration is a bit messy in VHDL '87
- The last architecture analyzed links to the entity

Can be used to change simulation behavior without re-analyzing the VHDL
source.

Complex configuration declarations are ignored in synthesis.

Some entities can have, e.g., a gate level architecture and a behavioral
architecture.

Are always optional.
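As a sketch of how a configuration declaration selects between several architectures of one entity: the 4-bit port width, the second architecture Comparator2 and the configuration name Comparator_cfg are assumptions added for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity Comparator is
  port (
    A, B : in  std_logic_vector(3 downto 0);
    EQ   : out std_logic
  );
end entity Comparator;

-- First architecture: a concurrent (dataflow style) description.
architecture Comparator1 of Comparator is
begin
  EQ <= '1' when (A = B) else '0';
end architecture Comparator1;

-- Second architecture: the same function written as a process.
architecture Comparator2 of Comparator is
begin
  process (A, B)
  begin
    if A = B then
      EQ <= '1';
    else
      EQ <= '0';
    end if;
  end process;
end architecture Comparator2;

-- Configuration declaration: binds the entity to one chosen architecture.
configuration Comparator_cfg of Comparator is
  for Comparator2
  end for;
end configuration Comparator_cfg;
```

Simulating Comparator_cfg instead of the bare entity makes the choice of architecture explicit rather than dependent on analysis order.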

Packages

Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data which can be shared globally among
several design units.

A package consists of a declaration part and an optional body part.

A package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

A package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
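A small sketch of a package declaration with its body. The package name vga_pkg, the subtypes and the function is_visible are hypothetical; the two constants simply encode the 640x480 resolution discussed earlier in the document.

```vhdl
-- Package declaration: the part visible to other design units.
package vga_pkg is
  constant H_VISIBLE : natural := 640;
  constant V_VISIBLE : natural := 480;

  subtype pixel_x_t is natural range 0 to H_VISIBLE - 1;
  subtype pixel_y_t is natural range 0 to V_VISIBLE - 1;

  -- Function declaration only; the body lives in the package body.
  function is_visible (x, y : natural) return boolean;
end package vga_pkg;

-- Package body: needed here because the package declares a function.
package body vga_pkg is
  function is_visible (x, y : natural) return boolean is
  begin
    return x < H_VISIBLE and y < V_VISIBLE;
  end function is_visible;
end package body vga_pkg;
```

A design unit compiled into the same library would typically make these names visible with `use work.vga_pkg.all;`.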

Libraries

Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware (HW).

It is possible to describe a circuit in a number of ways:

Structural -> Dataflow -> Behavioral (increasing level of abstraction)

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high level
description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.
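A minimal structural sketch: a 2-bit equality check built purely by wiring together smaller components. All entity names here (xnor_gate, and_gate, two_bit_eq) are assumptions for the example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity xnor_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity xnor_gate;

architecture dataflow of xnor_gate is
begin
  y <= a xnor b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity and_gate is
  port (a, b : in std_logic; y : out std_logic);
end entity and_gate;

architecture dataflow of and_gate is
begin
  y <= a and b;
end architecture dataflow;

library ieee;
use ieee.std_logic_1164.all;

entity two_bit_eq is
  port (
    A, B : in  std_logic_vector(1 downto 0);
    EQ   : out std_logic
  );
end entity two_bit_eq;

-- Structural architecture: only component instances and the signals connecting them.
architecture structural of two_bit_eq is
  signal eq0, eq1 : std_logic;
begin
  bit0 : entity work.xnor_gate port map (a => A(0), b => B(0), y => eq0);
  bit1 : entity work.xnor_gate port map (a => A(1), b => B(1), y => eq1);
  both : entity work.and_gate  port map (a => eq0,  b => eq1,  y => EQ);
end architecture structural;
```

Even for two bits this takes three instances and two internal signals, which hints at why low-level structural descriptions do not scale to large circuits.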

10.3 Dataflow VHDL Description

The circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between the registers in the system.

The combinational logic is described at a relatively high level, while the placement and operation of the registers is specified quite precisely.

The behavior of the system over time is defined by the registers.

There are no built-in registers in the VHDL language:

- either a lower-level description
- or a behavioral description of the sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for the registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
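
As an illustration of the dataflow style, the following is a minimal sketch that is not taken from the VGA design; the entity mux_reg and its port names are hypothetical. The combinational part is a single concurrent assignment, and because VHDL has no built-in register, the storage element is written as a small behavioral process.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a 2-to-1 multiplexer feeding a D register.
entity mux_reg is
  port (
    clk  : in  std_logic;
    sel  : in  std_logic;
    a, b : in  std_logic;
    q    : out std_logic
  );
end entity mux_reg;

architecture dataflow of mux_reg is
  signal d : std_logic;  -- combinational value waiting to be registered
begin
  -- Combinational logic as a concurrent (dataflow) assignment.
  d <= a when sel = '0' else b;

  -- The register itself is described behaviorally with a clocked process.
  reg : process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process reg;
end architecture dataflow;
```

The data path reads directly from the code: a or b is selected into d, and d is moved into q on each rising clock edge.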

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time.

The representation might include, e.g., state diagrams, timing diagrams and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no explicit delay is used, only the order of the sequential operations is defined.

At the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and on the efficiency of the synthesis tool.

Only a few tools support behavioral synthesis.
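
A minimal sketch of the two ways time appears in a behavioral description; the entity delay_demo and its signals are hypothetical. The first statement uses an explicit delay, which is meaningful in simulation only. The process below it uses no delay, so only the order of its statements is defined.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity, for illustration only.
entity delay_demo is
  port (
    b : in  std_logic;
    a : out std_logic;  -- delayed copy of b
    c : out std_logic   -- inverted copy of b, no explicit delay
  );
end entity delay_demo;

architecture behavioral of delay_demo is
begin
  -- Explicit delay: in simulation, a follows b 10 ns later.
  -- Synthesis tools typically ignore or reject the "after" clause.
  a <= b after 10 ns;

  -- No delay: only the order of the statements is defined.
  ordered : process (b)
    variable tmp : std_logic;
  begin
    tmp := not b;  -- step 1: the variable is updated immediately
    c   <= tmp;    -- step 2: uses the value computed in step 1
  end process ordered;
end architecture behavioral;
```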

                                                                        Concurrent Vs Sequential

                                                                        Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

A VHDL process is quite similar to a Unix process.

A process is introduced with the process keyword.

A process statement is a concurrent statement.

The statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or the wait statement(s) name the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
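
The two permitted forms can be sketched side by side. The entity and_gate and its ports are hypothetical; both processes describe the same AND function and wake up on the same signals, one through a sensitivity list and the other through a wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example entity, for illustration only.
entity and_gate is
  port (
    x, y   : in  std_logic;
    z1, z2 : out std_logic
  );
end entity and_gate;

architecture rtl of and_gate is
begin
  -- Form 1: sensitivity list, so no wait statement may appear inside.
  with_list : process (x, y)
  begin
    z1 <= x and y;
  end process with_list;

  -- Form 2: no sensitivity list, so an explicit wait suspends the process.
  with_wait : process
  begin
    z2 <= x and y;
    wait on x, y;
  end process with_wait;
end architecture rtl;
```

A sensitivity list behaves like a "wait on" statement placed at the end of the process, which is why the two forms are interchangeable here.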


                                                                        11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not used in this unit and
-- clash with NUMERIC_STD, so they are not included here.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port (
    clk50m : in  std_logic;  -- 50 MHz input clock
    rstn   : in  std_logic;  -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  -- Picture memory: one bit per pixel, 18-bit address
  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;    -- mode matches the sync_unit entity
      c_hs       : out integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the 50 MHz input clock by two to obtain the pixel clock.
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn  -- rstn is active low; confirm the core's sinit polarity matches
  );

end Behavioral;


                                                                        SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  -- The range constraint keeps the synthesized counters 10 bits wide.
  signal line_count_s, c_hs_s : integer range 0 to 1023;
begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks within a line, 0 to 793
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    elsif (clk = '1' and clk'event) then
      if (c_hs_s = 793) then
        c_hs_s <= 0;
      else
        c_hs_s <= c_hs_s + 1;
      end if;
    end if;
  end process;

  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      if (line_count_s = 515) then
        line_count_s <= 0;
      else
        line_count_s <= line_count_s + 1;
      end if;
    end if;
  end process;

  -- hsync pulse: low for the first 96 counts of each line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35) and (line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync pulse (front porch, back porch): low for the first two lines
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock; pixelgen uses it to detect the rising edge of v_en
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
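
For comparison with the counter limits used in the listing above (793, 143, 778, 515 and 35), the commonly published timing for 640x480 at 60 Hz can be collected in a small package. This is a sketch only: the package name and constant names are hypothetical, the figures assume the nominal 25.175 MHz pixel clock rather than the 25 MHz derived here from the 50 MHz input, and the limits assume a counter that starts at 0 at the beginning of the sync pulse, as the listing does. The source does not say why its own limits differ, so these values are offered as a reference point, not as a correction.

```vhdl
-- Commonly published 640x480 @ 60 Hz timing (nominal 25.175 MHz pixel clock).
-- The package and constant names are hypothetical.
package vga_timing_pkg is

  -- Horizontal timing, in pixel clocks
  constant H_SYNC    : natural := 96;
  constant H_BACK    : natural := 48;
  constant H_VISIBLE : natural := 640;
  constant H_FRONT   : natural := 16;
  constant H_TOTAL   : natural := H_SYNC + H_BACK + H_VISIBLE + H_FRONT;  -- 800

  -- Vertical timing, in lines
  constant V_SYNC    : natural := 2;
  constant V_BACK    : natural := 33;
  constant V_VISIBLE : natural := 480;
  constant V_FRONT   : natural := 10;
  constant V_TOTAL   : natural := V_SYNC + V_BACK + V_VISIBLE + V_FRONT;  -- 525

  -- Limits for counters that start at 0 at the beginning of the sync pulse
  constant H_SYNC_END    : natural := H_SYNC - 1;                       -- 95
  constant H_ACTIVE_FROM : natural := H_SYNC + H_BACK;                  -- 144
  constant H_ACTIVE_TO   : natural := H_SYNC + H_BACK + H_VISIBLE - 1;  -- 783
  constant H_MAX         : natural := H_TOTAL - 1;                      -- 799
  constant V_ACTIVE_FROM : natural := V_SYNC + V_BACK;                  -- 35
  constant V_ACTIVE_TO   : natural := V_SYNC + V_BACK + V_VISIBLE - 1;  -- 514
  constant V_MAX         : natural := V_TOTAL - 1;                      -- 524

end package vga_timing_pkg;
```

With these figures one line takes 800 pixel clocks and one frame takes 525 lines, so a 25.175 MHz clock gives 25 175 000 / (800 x 525), or roughly 59.94 frames per second.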


                                                                        PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;  -- provides "+" on std_logic_vector

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin

  addr <= addr_s;

  process (clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of the visible frame): restart the address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;

      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            -- Right of the picture: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Below the picture: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;

        -- Wrap the address after the last stored pixel
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Blanking: all colors off
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;

end behv;


                                                                        12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant share of its market.
By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 (estimate): $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

                                                                        IC costs are rising aggressively

                                                                        ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

                                                                        Revenue losses for slow time-to-market are increasing

                                                                        Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, a shift to which the company attributes the growing number of FPGA
design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
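The trade-off between number format and parallelism can be illustrated with a small sketch. The entity below is a hypothetical example (the name dot4, the port names, and the 8-bit operand width are all assumptions, not taken from this design): it computes a four-term dot product with four narrow multipliers working side by side, so one complete result is produced every clock cycle.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical four-term dot product: all four multiplications are
-- described in one clocked statement, so they are built as four
-- parallel multipliers rather than executed one after another.
entity dot4 is
  port (
    clk            : in  std_logic;
    x0, x1, x2, x3 : in  signed(7 downto 0);   -- narrow 8-bit samples
    c0, c1, c2, c3 : in  signed(7 downto 0);   -- narrow 8-bit coefficients
    y              : out signed(17 downto 0)   -- registered result
  );
end entity dot4;

architecture rtl of dot4 is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      -- Each product is 16 bits wide; it is sign-extended to 18 bits
      -- so that the sum of four products cannot overflow.
      y <= resize(x0 * c0, 18) + resize(x1 * c1, 18)
         + resize(x2 * c2, 18) + resize(x3 * c3, 18);
    end if;
  end process;
end architecture rtl;
```

Narrowing the operands further would shrink each multiplier, leaving room for more of them on the same device, which is the trade described above.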

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                                                        Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads
may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.

The FPGA described here is an array, or island-style, FPGA. It consists of
an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column, as shown below. All the
routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                        Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
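The logic block just described can be modelled behaviourally in a few lines of VHDL. This is only a sketch under stated assumptions: the entity name logic_block and the generics LUT_INIT and USE_FF are hypothetical, and a real device sets the equivalent values through its configuration bitstream rather than through generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of one logic block: a 4-input LUT, a D flip-flop,
-- and a selector that picks the registered or unregistered LUT output.
entity logic_block is
  generic (
    -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the flip-flop output, false the raw LUT output
    USE_FF   : boolean := true
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a table lookup: the inputs form the index into LUT_INIT.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output selection is fixed when the block is configured.
  y <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure; for example, x"6996" would give a 4-input XOR (odd parity), since that table is 1 exactly for the indices with an odd number of set bits.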

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                        Switch box topology

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors, high
speed I/O logic, and embedded memories.
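From the designer's side, such embedded functions are usually reached simply by describing the operation in HDL. The sketch below is a hypothetical registered multiplier (the entity name mult18 and the 18-bit operand width are assumptions). Synthesis tools commonly map a description like this onto a dedicated multiplier or DSP block when the target device has one, but whether that happens depends on the tool, its settings, and the device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical registered multiplier. The "*" operator describes the
-- function only; the synthesis tool decides whether to use an embedded
-- multiplier or to build the function from LUTs.
entity mult18 is
  port (
    clk  : in  std_logic;
    a, b : in  signed(17 downto 0);
    p    : out signed(35 downto 0)   -- product width = 18 + 18 bits
  );
end entity mult18;

architecture rtl of mult18 is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      p <= a * b;
    end if;
  end process;
end architecture rtl;
```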


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.
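As a small illustration of specifying a structure numerically, the sketch below describes a ripple-carry adder whose size is set by a single generic. The entity name ripple_adder and the generic WIDTH are hypothetical; the point is only that changing one number replicates the per-bit logic, where a schematic would need each stage drawn.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical parameterised adder: WIDTH full-adder stages are
-- produced by one generate loop instead of being drawn individually.
entity ripple_adder is
  generic (WIDTH : positive := 8);
  port (
    a, b : in  std_logic_vector(WIDTH-1 downto 0);
    cin  : in  std_logic;
    sum  : out std_logic_vector(WIDTH-1 downto 0);
    cout : out std_logic
  );
end entity ripple_adder;

architecture rtl of ripple_adder is
  signal c : std_logic_vector(WIDTH downto 0);  -- carry chain
begin
  c(0) <= cin;

  gen_bits : for i in 0 to WIDTH-1 generate
    -- One full adder per bit position.
    sum(i) <= a(i) xor b(i) xor c(i);
    c(i+1) <= (a(i) and b(i)) or (c(i) and (a(i) xor b(i)));
  end generate gen_bits;

  cout <= c(WIDTH);
end architecture rtl;
```

Instantiating the same entity with WIDTH set to 64 gives a 64-bit adder with no further changes to the source.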

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
such as an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at which
point propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
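The first of these stages, RTL simulation with a test bench, could look roughly like the sketch below. The design under test `clk_div2`, the test bench `clk_div2_tb`, the 20 ns clock period, and the checks are all assumptions made for illustration; the report's own test benches are not shown in the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: divides the input clock by two.
entity clk_div2 is
    port (
        clk_in  : in  std_logic;
        rstn    : in  std_logic;  -- active-low asynchronous reset
        clk_out : out std_logic
    );
end entity clk_div2;

architecture rtl of clk_div2 is
    signal q : std_logic := '0';
begin
    process (clk_in, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk_in) then
            q <= not q;
        end if;
    end process;

    clk_out <= q;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: has no ports; it drives the DUT and checks its response.
entity clk_div2_tb is
end entity clk_div2_tb;

architecture sim of clk_div2_tb is
    constant CLK_PERIOD : time := 20 ns;  -- assumed clock period
    signal clk_in  : std_logic := '0';
    signal rstn    : std_logic := '0';
    signal clk_out : std_logic;
    signal done    : boolean := false;
begin
    dut : entity work.clk_div2
        port map (clk_in => clk_in, rstn => rstn, clk_out => clk_out);

    -- Free-running clock that stops once the stimulus process is done.
    clk_in <= not clk_in after CLK_PERIOD / 2 when not done else '0';

    stimulus : process
    begin
        wait for 2 * CLK_PERIOD;               -- hold reset for two periods
        assert clk_out = '0'
            report "output should be low during reset" severity error;
        rstn <= '1';
        wait until rising_edge(clk_in);
        wait for 1 ns;                         -- let the output settle
        assert clk_out = '1'
            report "output should go high after first edge" severity error;
        wait until rising_edge(clk_in);
        wait for 1 ns;
        assert clk_out = '0'
            report "output should go low after second edge" severity error;
        done <= true;                          -- stops the clock
        wait;                                  -- suspend forever
    end process stimulus;
end architecture sim;
```

The same test bench can in principle be reused for the gate-level and back-annotated runs by binding `dut` to the post-synthesis or post-layout netlist instead of the RTL architecture.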

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design together with early software and firmware
co-design has become mainstream practice, and prototyping SoC and ASIC
designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed
and modeling accuracy limitations hinder development when simulation is the
only tool.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows much
more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications, and hardware, to be found
early.

3. Prototyping also allows the developer to ensure that all IP technologies in
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features than
would otherwise have been possible.

                                                                        13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                        14 APPLICATIONS

                                                                        Motion Tracking Systems

                                                                        Image processing systems

                                                                        Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.



Some entities can have, e.g., a gate-level architecture and a behavioral
architecture.

                                                                          Are always optional

Packages: Packages contain information common to many design units.

1. Package declaration
- constant declarations
- type and subtype declarations
- function and procedure declarations
- global signal declarations
- file declarations
- component declarations

2. Package body
- is not necessarily needed
- function bodies
- procedure bodies

Packages are meant for encapsulating data that can be shared globally among
several design units.

A package consists of a declaration part and an optional body part.

Package declaration can contain
- type and subtype declarations
- subprograms
- constants
- alias declarations
- global signal declarations
- file declarations
- component declarations

Package body consists of
- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations
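The split between package declaration and package body listed above can be sketched as follows. The package name `vga_types_pkg`, its constants, the subtype `pixel_t`, and the function `is_visible` are hypothetical and chosen only for illustration; the 640 by 480 values simply reuse the VGA resolution mentioned earlier in the report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package declaration: the part visible to other design units.
package vga_types_pkg is
    -- Constant declarations
    constant H_VISIBLE : natural := 640;
    constant V_VISIBLE : natural := 480;

    -- Deferred constant: declared here, value given in the package body
    constant FRAME_PIXELS : natural;

    -- Subtype declaration
    subtype pixel_t is std_logic_vector(0 downto 0);

    -- Function declaration (the body lives in the package body)
    function is_visible (x, y : natural) return boolean;
end package vga_types_pkg;

-- Package body: only needed because of the deferred constant and function.
package body vga_types_pkg is
    constant FRAME_PIXELS : natural := H_VISIBLE * V_VISIBLE;

    function is_visible (x, y : natural) return boolean is
    begin
        return x < H_VISIBLE and y < V_VISIBLE;
    end function is_visible;
end package body vga_types_pkg;
```

A design unit compiled into the same library would make these names visible with `use work.vga_types_pkg.all;`.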

Libraries: Collection of VHDL design units (database)

1. Packages
- package declaration
- package body

2. Entities (entity declaration)

3. Architectures (architecture body)

4. Configurations (configuration declarations)

Usually a directory in a UNIX file system.

Can also be any other kind of database.

Levels of Abstraction

VHDL supports many possible styles of design description, which differ
primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from
lower to higher level of abstraction:

Structural
Dataflow
Behavioral

Structural VHDL description

Circuit is described in terms of its components.

The description can range from low level (e.g., a transistor-level
description) to high level (e.g., a block diagram).

For large circuits, a low-level description quickly becomes impractical.
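A structural description in this sense might look like the half adder sketched below, which contains nothing but component instances and the wiring between them. The entities `and2`, `xor2`, and `half_adder` are assumptions introduced for this example only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf component: 2-input AND gate.
entity and2 is
    port (a, b : in std_logic; y : out std_logic);
end entity and2;

architecture rtl of and2 is
begin
    y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical leaf component: 2-input XOR gate.
entity xor2 is
    port (a, b : in std_logic; y : out std_logic);
end entity xor2;

architecture rtl of xor2 is
begin
    y <= a xor b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

entity half_adder is
    port (a, b : in std_logic; sum, carry : out std_logic);
end entity half_adder;

-- Structural architecture: only component instances and their connections.
architecture structural of half_adder is
    component and2
        port (a, b : in std_logic; y : out std_logic);
    end component;

    component xor2
        port (a, b : in std_logic; y : out std_logic);
    end component;
begin
    u_sum   : xor2 port map (a => a, b => b, y => sum);
    u_carry : and2 port map (a => a, b => b, y => carry);
end architecture structural;
```

The VGA controller code later in this report follows the same pattern at a higher level, wiring together `sync_unit`, `pixelgen`, and `memory` components.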

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in
the system.

The combinational logic is described at a relatively high level; the placement
and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:
- either a lower-level description
- or a behavioral description of sequential elements is needed.

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the
behavioral description of the registers.

The behavioral description can be provided in the form of
subprograms (functions or procedures).
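As a rough illustration of the points above, the sketch below combines a concurrent (dataflow) assignment for the combinational logic with a register that has to be described behaviorally, because the language offers no built-in one. The entity `reg_stage`, its 8-bit width, and the XOR operation are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: combinational logic feeding one register.
entity reg_stage is
    port (
        clk  : in  std_logic;
        rstn : in  std_logic;                     -- active-low reset
        a, b : in  std_logic_vector(7 downto 0);
        q    : out std_logic_vector(7 downto 0)
    );
end entity reg_stage;

architecture dataflow of reg_stage is
    signal d : std_logic_vector(7 downto 0);
begin
    -- Combinational logic as a concurrent (dataflow) assignment
    d <= a xor b;

    -- The register itself: written as a behavioral process
    reg : process (clk, rstn)
    begin
        if rstn = '0' then
            q <= (others => '0');
        elsif rising_edge(clk) then
            q <= d;
        end if;
    end process reg;
end architecture dataflow;
```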

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams, and
algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g., A <= B
after 10 ns).

If no actual delay is used, the order of sequential operations is defined.

In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed
timing specifications.

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
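The delay example quoted above can be placed in a complete, if trivial, design unit as sketched here. The entity name `delay_buf` is an assumption; the port names follow the `A <= B after 10 ns` example from the text.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a buffer with an explicit 10 ns delay.
entity delay_buf is
    port (
        b : in  std_logic;
        a : out std_logic
    );
end entity delay_buf;

architecture behavioral of delay_buf is
begin
    -- In simulation, a follows b 10 ns later.
    -- Synthesis tools typically ignore the "after" clause, so the real
    -- delay depends on the implementation technology.
    a <= b after 10 ns;
end architecture behavioral;
```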

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT
both.

The sensitivity list or wait statement(s) contain the signals that wake the
process up.

General format

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
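The two permitted forms of the general format can be compared side by side in the sketch below. Both processes describe the same AND function; the entity `and_gate` and its port names are assumptions for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same logic written with both process forms.
entity and_gate is
    port (
        a, b   : in  std_logic;
        y1, y2 : out std_logic
    );
end entity and_gate;

architecture behavioral of and_gate is
begin
    -- Form 1: sensitivity list, no wait statement.
    -- The process wakes up on any event on a or b.
    with_list : process (a, b)
    begin
        y1 <= a and b;
    end process with_list;

    -- Form 2: wait statement, no sensitivity list.
    -- The process runs once, then suspends until a or b changes.
    with_wait : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process with_wait;
end architecture behavioral;
```

Placing the `wait on` at the end of the process makes the second form behave like the first, including the initial execution at the start of simulation.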


                                                                          11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;
-- Note: STD_LOGIC_ARITH/STD_LOGIC_UNSIGNED and numeric_std declare
-- overlapping names; new designs normally keep numeric_std only.

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;   -- input clock
        rstn   : in  std_logic;   -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- Picture memory: 18-bit address, 1-bit data
    component memory
        port (
            addr  : IN  std_logic_VECTOR(17 downto 0);
            clk   : IN  std_logic;
            dout  : OUT std_logic_VECTOR(0 downto 0);
            en    : IN  std_logic;
            sinit : IN  std_logic
        );
    END component;

    component sync_unit
        port (
            clk        : in    std_logic;
            rstn       : in    std_logic;
            hsync      : out   std_logic;  -- horizontal sync pulse (row)
            vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out   std_logic;
            h_en       : out   std_logic;
            v_en       : out   std_logic;
            line_count : inout integer;
            c_hs       : inout integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

                                                                          43

                                                                          signal clkhsync_svsync_sh_en_sv_en_sv_en_reg_srd_sstd_logic

                                                                          signal line_count_sc_hs_saddrinteger

                                                                          signal data_sstd_logic_vector(0 downto 0)

                                                                          signal addr_sstd_logic_vector(17 downto 0)

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Toggle clk on every rising edge of clk50m (divide-by-two clock).
    process(clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
        --end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map(
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


                                                                          SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  ---horizontal synch pulse(row)
    vsync      : out std_logic;  --vertical sync pulse(frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;
begin
    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    --vsync counter
    --vertical line counter
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
        --end if;
    end process;

    --hsync pulse
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    ----Horizontal Enable (Row Information Will be Sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    ---vsync pulse
    -- Vertical Enable (Data is Valid in this Region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
        --end if;
    end process;

    --Vertical Sync Front porch Back Porch
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
        --end if;
    end process;

    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
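The counter limits in the synchronization block fix the horizontal timing: c_hs counts 0 to 793 (794 clocks per line) and hsync is held low while the count is 0 to 95 (96 clocks). A small self-checking testbench is one way to confirm this in simulation. The sketch below is not part of the original design: the entity name sync_unit_tb, the 40 ns clock period (a 25 MHz pixel clock is assumed) and the done flag are hypothetical, and it assumes sync_unit has been compiled into the work library.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical self-checking testbench; not part of the original listing.
entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is
    constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz pixel clock
    signal clk        : std_logic := '0';
    signal rstn       : std_logic := '0';
    signal hsync      : std_logic;
    signal vsync      : std_logic;
    signal v_en_reg   : std_logic;
    signal h_en       : std_logic;
    signal v_en       : std_logic;
    signal line_count : integer;
    signal c_hs       : integer;
    signal done       : boolean := false;
begin
    -- Free-running clock, stopped once the check has finished.
    clk <= not clk after CLK_PERIOD / 2 when not done else '0';

    dut : entity work.sync_unit
        port map (
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync,
            vsync      => vsync,
            v_en_reg   => v_en_reg,
            h_en       => h_en,
            v_en       => v_en,
            line_count => line_count,
            c_hs       => c_hs
        );

    check : process
        variable t_fall : time;
    begin
        wait for 4 * CLK_PERIOD;          -- hold the active-low reset briefly
        rstn <= '1';

        wait until falling_edge(hsync);   -- start of the first sync pulse
        t_fall := now;

        wait until rising_edge(hsync);    -- end of the sync pulse
        assert now - t_fall = 96 * CLK_PERIOD
            report "hsync low time is not 96 clocks" severity failure;

        wait until falling_edge(hsync);   -- start of the next line
        assert now - t_fall = 794 * CLK_PERIOD
            report "hsync period is not 794 clocks" severity failure;

        report "sync_unit line timing OK" severity note;
        done <= true;
        wait;
    end process;
end sim;
```

Because sync_unit uses the Synopsys packages STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED, some simulators need an extra option to make them available before the testbench will elaborate.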


                                                                          PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;

end behv;
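The wrap-around constant used for addr_s in the pixel generator is easier to interpret once it is converted to decimal. The arithmetic below is exact; reading the result as a 512 x 480 single-bit image is only an assumption, since the listing does not state the picture dimensions.

```latex
\begin{align*}
111011111111111111_2 &= 2^{17} + 2^{16} + 2^{15} + (2^{14} - 1) \\
                     &= 131072 + 65536 + 32768 + 16383 = 245759, \\
245759 + 1           &= 245760 = 512 \times 480 .
\end{align*}
```

The address counter therefore cycles through 245760 memory locations (0 to 245759) before returning to zero.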


                                                                          12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name field-programmable. FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called logic blocks, and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
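One common way a logic block is made configurable is a lookup table (LUT): a small memory whose contents are the truth table of the desired function, addressed by the block's inputs. The following is a minimal behavioural sketch of a 3-input LUT. The entity name lut3 and the generic INIT are hypothetical, and the sketch models the idea only, not any particular vendor's logic block.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical behavioural model of a 3-input lookup table.
entity lut3 is
    generic (
        -- Truth table: bit i is the output when (c, b, a) encodes i.
        -- "10000000" gives a 3-input AND; "10010110" gives a 3-input XOR.
        INIT : std_logic_vector(7 downto 0) := "10000000"
    );
    port (
        a, b, c : in  std_logic;
        y       : out std_logic
    );
end lut3;

architecture rtl of lut3 is
    signal sel : std_logic_vector(2 downto 0);
begin
    -- Pack the inputs into a 3-bit index, with c as the most significant bit.
    sel <= c & b & a;

    -- The "configuration" is just the stored table; the inputs select one bit.
    y <= INIT(to_integer(unsigned(sel)));
end rtl;
```

Changing only the INIT generic turns the same structure into a different gate, which is the sense in which a logic block is configured rather than rewired.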

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete system on a programmable chip. This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic. (See Soft
processors below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)

1992: 600,000 gates (Naval Surface Warfare Department)

Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally less capable than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing the final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively.

ASIC complexity has increased development time and costs.

R&D resources and headcount are decreasing.

Revenue losses for slow time-to-market are increasing.

Financial constraints in a poor economy are driving low-cost technologies.

According to the company, these trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and it credits them for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoCs). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
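The trade-off between number format and parallelism can be illustrated with a rough estimate. The sketch below is not a vendor model: it assumes, purely for illustration, that a multiplier built from logic costs about bit_width squared LUTs and that every unit completes one multiply per clock cycle. The LUT budget and clock rate are hypothetical.

```python
def parallel_units(lut_budget, bit_width):
    """Estimate how many multipliers fit in a LUT budget.

    Assumption (illustrative only): one multiplier costs roughly
    bit_width ** 2 LUTs, as for a simple array multiplier.
    """
    return lut_budget // (bit_width ** 2)


def throughput_mops(lut_budget, bit_width, clock_mhz):
    """Millions of multiplies per second, assuming one result per unit per cycle."""
    return parallel_units(lut_budget, bit_width) * clock_mhz


if __name__ == "__main__":
    # Hypothetical device: 20,000 LUTs available for arithmetic, 100 MHz clock.
    for width in (32, 16, 8):
        units = parallel_units(20_000, width)
        mops = throughput_mops(20_000, width, 100)
        print(f"{width}-bit: {units} units, {mops} million multiplies/s")
```

Under these assumptions, halving the word width roughly quadruples the number of arithmetic units that fit, which is why a modest clock rate can still yield high aggregate throughput.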

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in per-unit hardware cost for a programmable chip is more affordable than the development resources that creating an ASIC would require. Today, new cost and performance dynamics have broadened the range of viable applications.
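The volume argument above can be made concrete with a break-even calculation. The following is a minimal sketch under simplifying assumptions: the ASIC carries a one-time non-recurring engineering (NRE) cost, the FPGA carries none, and unit prices do not change with volume. All figures are hypothetical.

```python
import math


def break_even_volume(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Smallest production volume at which the ASIC is cheaper in total.

    Total ASIC cost = asic_nre + volume * asic_unit_cost
    Total FPGA cost = volume * fpga_unit_cost   (FPGA NRE assumed negligible)
    Returns None if the FPGA never costs more per unit than the ASIC.
    """
    premium = fpga_unit_cost - asic_unit_cost
    if premium <= 0:
        return None
    # At asic_nre / premium units the totals are equal; one more unit tips it.
    return math.floor(asic_nre / premium) + 1


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(break_even_volume(asic_nre=1_000_000, asic_unit_cost=5, fpga_unit_cost=25))
```

Below the returned volume the per-unit premium of the FPGA is the cheaper option; above it, the ASIC's NRE has been amortised.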

                                                                          Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that fit in terms of LUTs and I/Os can be routed. The number of tracks is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
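Rent's rule is an empirical relation between the number of external terminals T of a block of logic and the number of gates g it contains: T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The sketch below simply evaluates that formula; the default values of t and p are illustrative assumptions, not figures from any particular device.

```python
def rent_terminals(gates, t=4.0, p=0.6):
    """Estimate external terminals for a block of `gates` gates via Rent's rule.

    T = t * gates ** p
    t: average terminals per gate (assumed value)
    p: Rent exponent, typically below 1 (assumed value)
    """
    return t * gates ** p


if __name__ == "__main__":
    # Compare how the terminal estimate grows for two assumed Rent exponents.
    for gates in (100, 1_000, 10_000):
        low = rent_terminals(gates, p=0.5)
        high = rent_terminals(gates, p=0.7)
        print(f"{gates} gates: about {low:.0f} to {high:.0f} terminals")
```

A design with a higher Rent exponent needs more wiring for the same amount of logic, which is consistent with the crossbar versus systolic array comparison above.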

The FPGA considered here is an array-based, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                          Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
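The behaviour of such a logic block can be modelled in a few lines. The following is a simplified software sketch, not a description of any vendor's cell: the class name, the bit ordering of the LUT index, and the explicit clock() method are all assumptions made for illustration.

```python
class LogicBlock:
    """A 4-input LUT followed by an optional D flip-flop (simplified model)."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 truth-table bits")
        self.truth_table = list(truth_table)
        self.registered = registered  # selects registered or unregistered output
        self.ff = 0                   # flip-flop state, initially 0

    def lut(self, a, b, c, d):
        """Look up the configured function; input a is the least significant index bit."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the single block output for the current inputs."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = [0] * 15 + [1]            # configure the LUT as a 4-input AND
    comb = LogicBlock(and4)          # unregistered output
    reg = LogicBlock(and4, registered=True)
    print(comb.output(1, 1, 1, 1))   # combinational result, available immediately
    reg.clock(1, 1, 1, 1)            # registered result appears after a clock edge
    print(reg.output(0, 0, 0, 0))
```

The same 16 configuration bits can implement any Boolean function of four inputs, which is what makes the LUT a general-purpose building block.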

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                          Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Wherever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                          Switch box topology
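The planar topology can also be written down as a connection list. The sketch below is an illustrative model only: the side names, the zero-based track numbering, and the function names are assumptions. It enumerates one programmable switch for every pair of sides on the same track, so a wire entering on one side has exactly three possible connections.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")


def planar_switch_box(width):
    """Return the programmable switches of a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints.
    Tracks are numbered from 0 here; a wire on track i only ever
    meets wires on track i of the other three sides.
    """
    switches = set()
    for track in range(width):
        for side_a, side_b in combinations(SIDES, 2):
            switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(switches, side, track):
    """List the wires reachable from (side, track) through one switch."""
    reachable = []
    for switch in switches:
        if (side, track) in switch:
            (other,) = switch - {(side, track)}
            reachable.append(other)
    return sorted(reachable)


if __name__ == "__main__":
    box = planar_switch_box(width=4)
    print(len(box))                      # 6 switches per track
    print(neighbours(box, "west", 1))    # three wires, all on track 1
```

Because tracks never cross over inside the box, a signal that starts on a given track stays on it for its whole route unless the architecture provides some other means of changing tracks.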

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through several steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. Because designing in HDLs has been compared to programming in assembly language, there are moves to reduce this complexity by raising the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer simulates the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to exercise the system and observe the results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
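The idea of re-running the same test bench against the RTL description and the mapped netlist can be mimicked in software. The sketch below is only an analogy in Python, not an HDL flow: the behavioural function stands in for the RTL model, a pair of hand-written lookup tables stands in for a hypothetical mapped netlist, and the test bench drives every input combination through both.

```python
from itertools import product


def rtl_full_adder(a, b, cin):
    """Behavioural reference model: returns (sum, carry_out)."""
    total = a + b + cin
    return total & 1, total >> 1


# Hypothetical mapped netlist: one 3-input LUT per output,
# indexed by (cin << 2) | (b << 1) | a.
SUM_LUT = [0, 1, 1, 0, 1, 0, 0, 1]
CARRY_LUT = [0, 0, 0, 1, 0, 1, 1, 1]


def netlist_full_adder(a, b, cin):
    """Evaluate the LUT-based implementation of the same function."""
    index = (cin << 2) | (b << 1) | a
    return SUM_LUT[index], CARRY_LUT[index]


def run_testbench(reference, implementation):
    """Drive every input combination and collect the inputs that disagree."""
    mismatches = []
    for a, b, cin in product((0, 1), repeat=3):
        if reference(a, b, cin) != implementation(a, b, cin):
            mismatches.append((a, b, cin))
    return mismatches


if __name__ == "__main__":
    errors = run_testbench(rtl_full_adder, netlist_full_adder)
    print("netlist matches RTL" if not errors else f"mismatches: {errors}")
```

An empty mismatch list corresponds to the gate-level simulation confirming that synthesis preserved the behaviour; timing effects, which the final back-annotated simulation checks, are not modelled here.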

                                                                          Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good way to do this.


Reasons why prototyping is important

1 Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, and simulation speed and modeling accuracy limitations hinder that approach.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that arise from today's array of operating systems, applications, and hardware.

3 Prototyping also allows the developer to ensure that all IP technologies in the system work well together outside the simulation stage, in actual hardware.

4 Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                          13 SIMULATION RESULTS

Fig: VGA controller (gate level)

Fig: Sync unit (gate level)

Fig: Technology schematic

                                                                          14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.


Package body consists of:

- subprogram declarations and bodies
- type and subtype declarations
- deferred constants
- file declarations

Libraries: collection of VHDL design units (database)

1. Packages
   - package declaration
   - package body
2. Entities (entity declaration)
3. Architectures (architecture body)
4. Configurations (configuration declarations)

Usually a directory in the UNIX file system.

Can also be any other kind of database.
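
As a minimal sketch of the package structure listed above, the following hypothetical package declares a subtype, a deferred constant, and a function, and completes them in the package body. The names `vga_timing_pkg`, `h_count_t`, `H_TOTAL`, and `next_count` are illustrative assumptions and are not part of the original design; the value 794 is chosen only because the synchronization block counts 0 to 793 per line.

```vhdl
-- Package declaration: the visible interface.
package vga_timing_pkg is
    -- Subtype declaration (hypothetical range).
    subtype h_count_t is integer range 0 to 1023;

    -- Deferred constant: its value is supplied in the package body.
    constant H_TOTAL : h_count_t;

    -- Subprogram declaration.
    function next_count(c : h_count_t) return h_count_t;
end package vga_timing_pkg;

-- Package body: completes what the declaration left open.
package body vga_timing_pkg is
    -- Full declaration of the deferred constant (assumed value).
    constant H_TOTAL : h_count_t := 794;

    -- Subprogram body: increment with wrap-around at H_TOTAL - 1.
    function next_count(c : h_count_t) return h_count_t is
    begin
        if c = H_TOTAL - 1 then
            return 0;
        else
            return c + 1;
        end if;
    end function next_count;
end package body vga_timing_pkg;
```

A design unit could then make these names visible with `use work.vga_timing_pkg.all;`, assuming the package is compiled into the `work` library.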

Levels of Abstraction

VHDL supports many possible styles of design description, which differ primarily in how closely they relate to the hardware.

It is possible to describe a circuit in a number of ways, listed here from the lowest to the highest level of abstraction:

1. Structural
2. Dataflow
3. Behavioral

Structural VHDL description

Circuit is described in terms of its components.

From a low-level description (e.g. transistor-level description) to a high-level description (e.g. block diagram).

For large circuits, a low-level description quickly becomes impractical.
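
A minimal sketch of the structural style, assuming two hypothetical gate entities `and2` and `xor2` (illustrative names, not from the original design): the half adder below contains no logic of its own and is described purely by instantiating components and wiring them together.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity and2 is
    port (a, b : in std_logic; y : out std_logic);
end and2;

architecture rtl of and2 is
begin
    y <= a and b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity xor2 is
    port (a, b : in std_logic; y : out std_logic);
end xor2;

architecture rtl of xor2 is
begin
    y <= a xor b;
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity half_adder is
    port (a, b : in std_logic; sum, carry : out std_logic);
end half_adder;

-- Structural architecture: only component instances and their connections.
architecture structural of half_adder is
    component and2
        port (a, b : in std_logic; y : out std_logic);
    end component;

    component xor2
        port (a, b : in std_logic; y : out std_logic);
    end component;
begin
    u_sum   : xor2 port map (a => a, b => b, y => sum);
    u_carry : and2 port map (a => a, b => b, y => carry);
end structural;
```

The sketch relies on default binding: each component is bound to the entity of the same name in the `work` library, so no configuration declaration is needed here.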

10.3 Dataflow VHDL Description

Circuit is described in terms of how data moves through the system.

In the dataflow style, you describe how information flows between registers in the system.

The combinational logic is described at a relatively high level; the placement and operation of registers is specified quite precisely.

The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language:

- either a lower-level description
- or a behavioral description of sequential elements is needed

The lower-level register descriptions must be created or obtained.

If there are no third-party models for registers, you must write the behavioral description of the registers yourself.

The behavioral description can be provided in the form of subprograms (functions or procedures).
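
Since the language has no built-in register, a sketch of the dataflow style might pair a concurrent assignment for the combinational logic with a hand-written behavioral register. The entity `reg_xor` and its ports are hypothetical, and the register is written here as a clocked process rather than as a subprogram.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity reg_xor is
    port (
        clk  : in  std_logic;
        rstn : in  std_logic;  -- active-low asynchronous reset
        a, b : in  std_logic;
        q    : out std_logic
    );
end reg_xor;

architecture dataflow of reg_xor is
    signal d : std_logic;  -- register input
begin
    -- Dataflow part: combinational logic as a concurrent assignment.
    d <= a xor b;

    -- Register: behavioral description of a D flip-flop.
    process (clk, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk) then
            q <= d;
        end if;
    end process;
end dataflow;
```

The reset style (active-low, asynchronous) follows the convention used by the `rstn` input in the controller code of this report.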

10.4 Behavioral VHDL Description

Circuit is described in terms of its operation over time.

Representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed timing specifications.

The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
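
The delay example above can be tried in a simulator with a small self-checking sketch. The entity name `delay_demo` and the stimulus times are assumptions chosen for illustration; the `after` clause is meaningful in simulation only.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity delay_demo is
end delay_demo;

architecture behavioral of delay_demo is
    signal a : std_logic;
    signal b : std_logic := '0';
begin
    -- A follows B with a 10 ns delay.
    a <= b after 10 ns;

    stimulus : process
    begin
        wait for 20 ns;
        b <= '1';          -- B rises at 20 ns
        wait for 15 ns;    -- now at 35 ns
        -- A was scheduled to rise at 30 ns, so it must be '1' by now.
        assert a = '1' report "a did not follow b" severity error;
        wait;              -- suspend forever
    end process stimulus;
end behavioral;
```

A synthesis tool would typically discard the `after 10 ns` clause, which is why the delay cannot be relied on in hardware.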

Concurrent vs. Sequential

Processes

Basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

Quite similar to a Unix process.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside process statements are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
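
To illustrate the two permitted forms, the sketch below describes the same AND function twice: once with a sensitivity list and once with an explicit wait statement. The entity `process_forms` and its port names are hypothetical.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity process_forms is
    port (
        a, b : in  std_logic;
        y1   : out std_logic;
        y2   : out std_logic
    );
end process_forms;

architecture behavioral of process_forms is
begin
    -- Form 1: sensitivity list, no wait statement.
    with_list : process (a, b)
    begin
        y1 <= a and b;
    end process with_list;

    -- Form 2: no sensitivity list, explicit wait statement at the end.
    with_wait : process
    begin
        y2 <= a and b;
        wait on a, b;
    end process with_wait;
end behavioral;
```

Both processes run once at initialization and then wake up on any event on `a` or `b`, so `y1` and `y2` behave identically.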

                                                                            11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- numeric_std is kept for the commented-out to_unsigned conversion below.
-- The non-standard STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED packages are
-- dropped: their names clash with numeric_std and nothing here needs them.
use IEEE.NUMERIC_STD.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;    -- mode "out" to match the sync_unit entity
            c_hs       : out integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the board clock by two to obtain the clock for the other blocks.
    process (clk50m, rstn)
    begin
        if rstn = '0' then
            clk <= '0';
        elsif rising_edge(clk50m) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    -- Note: sinit is driven by the active-low rstn; check that this matches
    -- the polarity expected by the memory core.
    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;
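
The divide-by-two clock process in the listing above can be exercised on its own with a small testbench sketch. It assumes that `clk50m` is a 50 MHz input, as its name suggests, so that the derived `clk` has a 40 ns period (25 MHz). The testbench name `clk_div_tb`, the `done` flag, and the stimulus times are hypothetical.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity clk_div_tb is
end clk_div_tb;

architecture sim of clk_div_tb is
    signal clk50m : std_logic := '0';
    signal rstn   : std_logic := '0';
    signal clk    : std_logic;
    signal done   : boolean := false;  -- stops the clock generator
begin
    -- Assumed 50 MHz stimulus: 20 ns period.
    clk_gen : process
    begin
        while not done loop
            clk50m <= '0';
            wait for 10 ns;
            clk50m <= '1';
            wait for 10 ns;
        end loop;
        wait;
    end process clk_gen;

    -- Same divide-by-two structure as in the controller.
    divider : process (clk50m, rstn)
    begin
        if rstn = '0' then
            clk <= '0';
        elsif rising_edge(clk50m) then
            clk <= not clk;
        end if;
    end process divider;

    -- Release reset, then measure one period of the divided clock.
    check : process
        variable t0 : time;
    begin
        wait for 25 ns;
        rstn <= '1';
        wait until rising_edge(clk);
        t0 := now;
        wait until rising_edge(clk);
        assert now - t0 = 40 ns
            report "divided clock period is not 40 ns" severity error;
        done <= true;
        wait;
    end process check;
end sim;
```

Once `done` is set, the clock generator finishes its current cycle and stops, so the simulation ends on its own without a run-time limit.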

                                                                            SYNCHRONIZATION BLOCK CODE

                                                                            library IEEE

                                                                            use IEEESTD_LOGIC_1164ALL

                                                                            use IEEESTD_LOGIC_ARITHALL

                                                                            use IEEESTD_LOGIC_UNSIGNEDALL

                                                                            entity sync_unit is port(

                                                                            clkin std_logic

                                                                            rstnin std_logic

                                                                            hsyncout std_logic ---horizontal synch pulse(row)

                                                                            vsyncout std_logic --vertical sync

                                                                            pulse(frame) 111110100000000000

                                                                            v_en_regout std_logic

                                                                            h_enout std_logic

                                                                            v_enout std_logic

                                                                            line_countout integer

                                                                            c_hsout integer )

                                                                            end sync_unit

                                                                            architecture arch_sync_unit of sync_unit is

                                                                            signal h_en_sv_en_sstd_logic

                                                                            signal line_count_sc_hs_sinteger

                                                                            begin

                                                                            c_hs lt= c_hs_s

                                                                            line_count lt= line_count_s

                                                                            h_en lt= h_en_s

                                                                            v_en lt= v_en_s

                                                                            -- hsync counter

                                                                            process(clkrstn)

                                                                            begin

                                                                            if(rstn = 0)then

                                                                            46

                                                                            c_hs_s lt= 0

                                                                            else

                                                                            if(clk = 1 and clkevent)then

                                                                            if(c_hs_s=793)then

                                                                            c_hs_s lt= 0

                                                                            else

                                                                            c_hs_s lt= c_hs_s +1

                                                                            end if

                                                                            end if

                                                                            end if

                                                                            end process

                                                                            --vsync counter

                                                                            --vertical line counter

                                                                            process(h_en_srstn)

                                                                            begin

                                                                            if(rstn = 0)then

                                                                            line_count_s lt= 0

                                                                            elsif(h_en_s = 1 and h_en_sevent)then

                                                                            line_count_s lt= line_count_s + 1

                                                                            if (line_count_s=515)then

                                                                            line_count_s lt= 0

                                                                            end if

                                                                            end if

                                                                            --end if

                                                                            end process

                                                                            --hysnc pulse

                                                                            process(clkrstn)

                                                                            begin

                                                                            if(rstn = 0)then

                                                                            47

                                                                            hsync lt= 1

                                                                            elsif(clk = 1 and clkevent)then

                                                                            if(c_hs_s lt=95)then

                                                                            hsync lt= 0

                                                                            else

                                                                            hsync lt= 1

                                                                            end if

                                                                            end if

                                                                            --end if

                                                                            end process

-- Horizontal Enable (row information will be sent in this period)
process(clk, rstn)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
end process;

-- vsync pulse
-- Vertical Enable (data is valid in this region)
process(clk, rstn)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35) and (line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
end process;

-- Vertical Sync, Front Porch, Back Porch: vsync is driven low for the first two lines
process(clk, rstn)
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
end process;

-- One-clock delayed copy of the vertical enable
process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;


                                                                            PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of the vertical enable: restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read memory and show the pixel on all three channels
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the read address at the last location
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;


                                                                            12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
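The idea of a logic block as a small lookup table with an optional flip-flop can be sketched in VHDL. This is only an illustration under stated assumptions, not any vendor's actual block: the entity name logic_block, the generics INIT and REGISTERED, and the 3-input size are all hypothetical. It uses ieee.numeric_std rather than the std_logic_arith package used in the listings of this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 3-input logic block: a lookup table followed by an optional flip-flop.
entity logic_block is
  generic (
    INIT       : std_logic_vector(7 downto 0) := "10010110"; -- truth table; this value is a 3-input XOR
    REGISTERED : boolean := false                              -- true: output is taken from the flip-flop
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(2 downto 0);  -- the three logic inputs, used as the table address
    y   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The LUT is a tiny memory: the inputs select one bit of INIT.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- The memory element: a simple D flip-flop on the LUT output.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  y <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing INIT changes the function without changing the circuit: "10000000" would make the same block a 3-input AND, and "11111110" a 3-input OR.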

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field-programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant share of its market. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000 gates, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, which the company cites as the reason for the growing number of FPGA
design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.
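The CPLD structure described above, a sum-of-products array feeding a clocked register, can be sketched as a single cell in VHDL. This is a hedged illustration only: the entity name sop_macrocell, its ports, and the two product terms chosen are hypothetical, and in a real CPLD the selection of product terms is what the device programming determines.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical cell: two product terms ORed together, feeding one clocked register.
entity sop_macrocell is
  port (
    clk     : in  std_logic;
    a, b, c : in  std_logic;
    q       : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum : std_logic;
begin
  -- Sum of products: (a AND b) OR ((NOT a) AND c).
  sum <= (a and b) or ((not a) and c);

  -- The register that the logic array feeds.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end architecture rtl;
```

Every input reaches the register through the same fixed AND-OR path, which is one way to see why timing in such a structure is easier to predict than in an interconnect-dominated FPGA.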

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
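The trade between number format and parallelism can be sketched in VHDL with a bank of multipliers whose count and operand width are generics. This is a minimal sketch under stated assumptions rather than a tuned design: the entity name parallel_mult and the generics LANES and WIDTH are hypothetical, operands are treated as unsigned integers, and ieee.numeric_std is used instead of the std_logic_arith package found in the listings of this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical bank of LANES multipliers that all produce a result on every clock.
entity parallel_mult is
  generic (
    LANES : positive := 4;  -- number of multipliers working side by side
    WIDTH : positive := 8   -- operand width in bits (the precision of the number format)
  );
  port (
    clk : in  std_logic;
    x   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);   -- LANES operands packed side by side
    w   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);   -- LANES operands packed side by side
    p   : out std_logic_vector(LANES*2*WIDTH - 1 downto 0)  -- LANES products, each 2*WIDTH bits
  );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
  -- One registered multiplier is generated per lane; lanes do not share any logic.
  lanes_g : for i in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        p((i+1)*2*WIDTH - 1 downto i*2*WIDTH) <=
          std_logic_vector(unsigned(x((i+1)*WIDTH - 1 downto i*WIDTH)) *
                           unsigned(w((i+1)*WIDTH - 1 downto i*WIDTH)));
      end if;
    end process;
  end generate lanes_g;
end architecture rtl;
```

With the defaults, four 8-bit products are produced per clock cycle. Lowering WIDTH makes each multiplier smaller, so for a fixed amount of logic LANES can usually be raised; how far depends on the device and the synthesis tool.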

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                                                            Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule, or by experiments with existing designs.
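For reference, Rent's rule is usually stated as a power law. The symbols below are the conventional ones and are not taken from this report: T is the number of external terminals of a block of logic, g is the number of gates (or logic blocks) inside it, t is the average number of terminals per gate, and p is the empirically fitted Rent exponent, with 0 < p < 1.

```latex
T = t \cdot g^{p}
```

A design with a larger exponent p needs more wiring for the same g, which is consistent with the crossbar and systolic array comparison above.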

The FPGA described here is an array-style, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                            Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
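The logic block described above can be modelled in a few lines of VHDL. This is a minimal behavioral sketch, not a vendor primitive: the entity name logic_block and the generics LUT_INIT and REGISTERED are hypothetical names introduced here for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: one 4-input LUT, one flip-flop,
-- and a single output that is either registered or unregistered.
entity logic_block is
  generic (
    LUT_INIT   : std_logic_vector(15 downto 0) := x"8000"; -- truth table; x"8000" is a 4-input AND
    REGISTERED : boolean := true                           -- select the flip-flop or the raw LUT output
  );
  port (
    clk : in  std_logic;
    i   : in  std_logic_vector(3 downto 0);
    o   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The four inputs form an address into the 16-entry truth table.
  lut_out <= LUT_INIT(to_integer(unsigned(i)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  o <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

In a real device both the truth table and the output selection are set by configuration bits; generics stand in for them here.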

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                            Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                            Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.
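As a small illustration of specifying a large structure numerically, the following sketch builds a shift register whose length is set by a single generic. The entity name shift_reg and the generic DEPTH are hypothetical names chosen for this example. Changing one number replaces drawing every stage by hand.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: a shift register whose length is one number.
entity shift_reg is
  generic (DEPTH : positive := 64);  -- number of flip-flop stages (assumed value)
  port (
    clk  : in  std_logic;
    din  : in  std_logic;
    dout : out std_logic
  );
end entity shift_reg;

architecture rtl of shift_reg is
  -- taps(0) is the input; taps(k) is the output of stage k.
  signal taps : std_logic_vector(DEPTH downto 0);
begin
  taps(0) <= din;

  -- The generate statement replicates one flip-flop stage DEPTH times.
  gen_stages : for k in 1 to DEPTH generate
    stage : process (clk)
    begin
      if rising_edge(clk) then
        taps(k) <= taps(k - 1);
      end if;
    end process stage;
  end generate gen_stages;

  dout <= taps(DEPTH);
end architecture rtl;
```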

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and from other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
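The first step of this flow, RTL simulation with a test bench, might look like the sketch below. The device under test (a single D flip-flop named dff) and the test bench dff_tb are hypothetical examples, not part of this project's code. The clock period and check times are arbitrary choices.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical device under test: a single D flip-flop.
entity dff is
  port (clk, d : in std_logic; q : out std_logic);
end entity dff;

architecture rtl of dff is
begin
  process (clk)
  begin
    if rising_edge(clk) then
      q <= d;
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: no ports; it generates stimulus and checks the response.
entity dff_tb is
end entity dff_tb;

architecture sim of dff_tb is
  signal clk  : std_logic := '0';
  signal d, q : std_logic := '0';
  signal done : boolean   := false;
begin
  dut : entity work.dff port map (clk => clk, d => d, q => q);

  -- 20 ns clock that stops once the stimulus process has finished.
  clock_gen : process
  begin
    while not done loop
      clk <= '0'; wait for 10 ns;
      clk <= '1'; wait for 10 ns;
    end loop;
    wait;
  end process clock_gen;

  stimulus : process
  begin
    d <= '1';
    wait until rising_edge(clk);
    wait for 1 ns;  -- sample shortly after the clock edge
    assert q = '1' report "q did not follow d = '1'" severity error;
    d <= '0';
    wait until rising_edge(clk);
    wait for 1 ns;
    assert q = '0' report "q did not follow d = '0'" severity error;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

The same test bench can in principle be rerun against the gate-level and back-annotated netlists by binding dut to those descriptions instead of the RTL architecture.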

                                                                            Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                            Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

                                                                            Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

                                                                            13 SIMULATION RESULTS

Fig. Gate-level view of the VGA controller

Fig. Gate-level view of the sync unit

Fig. Technology schematic

                                                                            14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

The circuit is described in terms of how data moves through the system. In the dataflow style, you describe how information flows between registers in the system. The combinational logic is described at a relatively high level, while the placement and operation of registers is specified quite precisely. The behavior of the system over time is defined by registers.

There are no built-in registers in the VHDL language, so either a lower-level description or a behavioral description of sequential elements is needed. The lower-level register descriptions must be created or obtained. If there are no third-party models for registers, you must write the behavioral description of the registers yourself. The behavioral description can be provided in the form of subprograms (functions or procedures).
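A behavioral description of a register can be quite short. The sketch below uses a clocked process rather than a subprogram, which is the more common idiom. The entity name reg8, the 8-bit width, and the active-low asynchronous reset are assumptions made for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical 8-bit register described behaviorally.
entity reg8 is
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;                     -- active-low asynchronous reset (assumed)
    d    : in  std_logic_vector(7 downto 0);
    q    : out std_logic_vector(7 downto 0)
  );
end entity reg8;

architecture behavioral of reg8 is
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= (others => '0');      -- clear the register while reset is asserted
    elsif rising_edge(clk) then
      q <= d;                    -- capture the input on the rising clock edge
    end if;
  end process;
end architecture behavioral;
```

Synthesis tools generally recognise this pattern and map it onto the flip-flops in the logic blocks.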

10.4 Behavioral VHDL Description

The circuit is described in terms of its operation over time. The representation might include, e.g., state diagrams, timing diagrams, and algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g., A <= B after 10 ns). If no actual delay is used, only the order of the sequential operations is defined.

In the lower levels of abstraction (e.g., RTL), synthesis tools ignore detailed timing specifications. The actual timing results depend on the implementation technology and the efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
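The delay notation above can be tried in a simulation-only sketch such as the following. The entity name delay_demo and the check times are arbitrary choices for illustration. As noted above, a synthesis tool would ignore the "after" clause.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Simulation-only sketch: a follows b with a 10 ns delay.
entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
  signal a, b : std_logic := '0';
begin
  a <= b after 10 ns;

  stimulus : process
  begin
    wait for 5 ns;
    b <= '1';          -- b changes at 5 ns, so a is scheduled to change at 15 ns
    wait for 9 ns;     -- now at 14 ns: a must still hold its old value
    assert a = '0' report "a changed too early" severity error;
    wait for 2 ns;     -- now at 16 ns: a must have followed b
    assert a = '1' report "a did not follow b after 10 ns" severity error;
    wait;
  end process stimulus;
end architecture sim;
```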

                                                                              Concurrent Vs Sequential

                                                                              Processes

Processes are the basic simulation concept in VHDL. A VHDL description can always be broken up into interconnected processes, which are quite similar to Unix processes.

Process keyword in VHDL

The process statement is a concurrent statement. Statements inside a process statement are sequential statements. A process must contain either a sensitivity list or wait statement(s), but NOT both. The sensitivity list or wait statement(s) contain the signals that wake the process up.

General format:

    process [(sensitivity_list)]
      process_declarative_part
    begin
      process_statements
      [wait_statement]
    end process;
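The two legal forms of the general format above can be compared side by side. In this minimal sketch the entity name and_two_ways and its ports are hypothetical. Both processes compute the same AND function, one woken by a sensitivity list and the other by an explicit wait statement.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical example: the same logic written with each wake-up mechanism.
entity and_two_ways is
  port (
    a, b   : in  std_logic;
    y_list : out std_logic;
    y_wait : out std_logic
  );
end entity and_two_ways;

architecture behavioral of and_two_ways is
begin
  -- Form 1: sensitivity list, no wait statement.
  with_list : process (a, b)
  begin
    y_list <= a and b;
  end process with_list;

  -- Form 2: no sensitivity list, explicit wait at the end of the body.
  with_wait : process
  begin
    y_wait <= a and b;
    wait on a, b;
  end process with_wait;
end architecture behavioral;
```

Placing the wait at the end of the body mirrors the sensitivity-list form, because both processes run once at the start of simulation and then suspend until a or b changes.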

                                                                              11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
-- Note: numeric_std and std_logic_arith both declare SIGNED and UNSIGNED,
-- so with both use clauses present those names are not directly visible.
-- Nothing in this file uses them.
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;
        rstn   : in  std_logic;
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    -- line_count and c_hs are declared "out" here to match the
    -- sync_unit entity (the original component used "inout").
    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
        );
    end component;

    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the board clock by two to obtain the pixel clock.
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    -- rstn is active-low; check that this matches the polarity the
    -- memory core expects on sinit.
    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


                                                                              SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;
        c_hs       : out integer
    );
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- Horizontal counter: counts pixel clocks 0 to 793 within each line.
    process (clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- Vertical line counter: advances on each rising edge of h_en
    -- and counts lines 0 to 515 within each frame.
    process (h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            -- This later assignment overrides the increment at the last line.
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse: active low for counts 0 to 95
    process (clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information is sent in this period)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process (clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35) and (line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse: active low for lines 0 and 1; the remaining
    -- non-visible lines form the vertical front and back porch
    process (clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en delayed by one clock; pixelgen compares it with v_en
    -- to detect the start of the visible region.
    process (clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
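The counter limits in the synchronization block fix the scan rates. As a rough check, assuming the board clock clk50m really is 50 MHz (an assumption taken from the signal name) and is halved by the top-level divider, the horizontal counter wraps after 794 clocks (0 to 793), the line counter after 516 lines (0 to 515), and hsync is low for 96 clocks (0 to 95):

```latex
\begin{align*}
f_{\text{pixel}} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
f_{\text{line}}  &= \frac{25\,000\,000\ \text{Hz}}{794} \approx 31.49\ \text{kHz} \\
f_{\text{frame}} &= \frac{f_{\text{line}}}{516} \approx 61.0\ \text{Hz} \\
t_{\text{hsync}} &= \frac{96}{25\ \text{MHz}} = 3.84\ \mu\text{s}
\end{align*}
```

These figures are close to the nominal 640x480 VGA line rate of about 31.5 kHz and refresh rate of about 60 Hz.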


                                                                              PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
    port (
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic
    );
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process (clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en: restart the memory address for a new frame.
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Visible pixel: read memory and drive all three colors.
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address at the last memory location.
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Outside the enabled region: blank the outputs.
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

end behv;


                                                                              12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates like AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
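A logic block of this kind can be modelled in VHDL as a small lookup table followed by an optional flip-flop. The sketch below is illustrative only: the entity name logic_block_sketch, its ports, and the INIT generic are hypothetical and do not correspond to any vendor primitive. The truth table is indexed by the input value, so changing INIT changes the function without changing the structure.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 2-input logic block: a lookup table plus a flip-flop.
entity logic_block_sketch is
    generic (
        -- Truth table: bit i is the output for input value i (i = b & a).
        -- "0110" implements XOR; "1000" would implement AND.
        INIT : std_logic_vector(3 downto 0) := "0110"
    );
    port (
        clk    : in  std_logic;
        a      : in  std_logic;
        b      : in  std_logic;
        comb_o : out std_logic;  -- combinational LUT output
        reg_o  : out std_logic   -- LUT output registered in a flip-flop
    );
end logic_block_sketch;

architecture sketch of logic_block_sketch is
    signal sel     : unsigned(1 downto 0);
    signal lut_out : std_logic;
begin
    -- Form the table index from the two inputs.
    sel <= b & a;

    -- The lookup itself: select one bit of the truth table.
    lut_out <= INIT(to_integer(sel));
    comb_o  <= lut_out;

    -- Memory element: capture the LUT output on each rising clock edge.
    process (clk)
    begin
        if rising_edge(clk) then
            reg_o <= lut_out;
        end if;
    end process;
end sketch;
```

Real logic blocks typically have wider lookup tables and additional routing options, but the principle of a configurable truth table feeding an optional register is the same.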

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs could both be programmed
in batches in a factory or in the field (field programmable); however, their
programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array, the XC2064, in 1985.
The XC2064 had programmable gates and programmable interconnects between
gates, the beginnings of a new technology and market. It boasted a mere 64
configurable logic blocks (CLBs), each with two 3-input lookup tables (LUTs).
More than 20 years later, Freeman was inducted into the National Inventors
Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks originate in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded significant market share. By 1993,
Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

                                                                              121 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24; that work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative to hard-macro processors is to use soft processor cores that
are implemented within the FPGA logic. (See Soft processors below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or
reconfigurable systems: CPUs that reconfigure themselves to suit the task at
hand. The Mitrion Virtual Processor from Mitrionics is an example of a
reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at run time; instead, it adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates
1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size
1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


                                                                              122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can
also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs are capable of partial re-configuration, which lets one portion
of the device be re-programmed while other portions continue running.

                                                                              123 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4
to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in per-unit hardware costs for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
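
The low-volume trade-off can be made concrete with a short calculation. The
sketch below, in Python, assumes a simple cost model (a one-time
non-recurring engineering cost plus a fixed per-unit cost). The function
names and all dollar figures are hypothetical, chosen only for illustration,
and are not taken from any vendor data.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost of `volume` units: one-time NRE plus per-unit cost."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume above which the ASIC becomes cheaper than the FPGA.

    Returns None when the ASIC never catches up, i.e. when its
    per-unit cost is not lower than the FPGA's.
    """
    if fpga_unit <= asic_unit:
        return None
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical figures, for illustration only.
ASIC_NRE, ASIC_UNIT = 1_000_000, 5   # high up-front cost, cheap parts
FPGA_NRE, FPGA_UNIT = 50_000, 30     # low up-front cost, expensive parts

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"break-even volume: {volume:.0f} units")

for n in (1_000, 100_000):
    fpga = total_cost(FPGA_NRE, FPGA_UNIT, n)
    asic = total_cost(ASIC_NRE, ASIC_UNIT, n)
    print(f"{n} units: FPGA ${fpga}, ASIC ${asic}")
```

Under these assumed numbers the FPGA is the cheaper choice below the
break-even volume and the ASIC above it. Lower FPGA unit prices or higher
ASIC development costs push that break-even point upward.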

                                                                              Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
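
Rent's rule is commonly written as T = t * G^p, where T is the number of
external signal connections of a block of logic, G is the number of logic
elements in it, t is the average number of terminals per element, and p is
the Rent exponent. A minimal Python sketch, assuming illustrative values for
t and p (they are not measurements of any real design), shows how the same
amount of logic can demand very different amounts of wiring:

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Estimate external connections with Rent's rule: T = t * G**p."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return terminals_per_block * num_blocks ** rent_exponent


# Two hypothetical designs with the same amount of logic (256 blocks,
# 4 terminals per block) but different assumed Rent exponents.
regular = rent_terminals(256, 4, 0.5)     # mostly local, array-like wiring
irregular = rent_terminals(256, 4, 0.75)  # more global wiring

print(f"p = 0.50: about {regular:.0f} connections")
print(f"p = 0.75: about {irregular:.0f} connections")
```

With these assumed exponents the second design needs roughly four times as
many connections as the first for identical logic, which is why channel
width cannot be chosen from the block count alone.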

The example FPGA described here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All
the routing channels have the same width (number of wires).

                                                                              121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                              Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are managed separately.
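
The behavior of such a logic block can be modelled in a few lines of
software. The following Python sketch is a simplified behavioral model, not
vendor code: the class name `LogicBlock`, the helper `table_from`, and the
bit ordering of the truth table are assumptions made for illustration.

```python
class LogicBlock:
    """A 4-input LUT followed by a D flip-flop, with a selectable output."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.truth_table = list(truth_table)
        self.registered = registered  # True: output comes from the flip-flop
        self.ff = 0                   # flip-flop state

    def lut(self, a, b, c, d):
        """Look up the configured output bit for the four inputs."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a clock edge: capture the LUT output in the flip-flop."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the registered or the unregistered LUT output."""
        return self.ff if self.registered else self.lut(a, b, c, d)


def table_from(func):
    """Build the 16 configuration bits from a 4-input Boolean function."""
    return [func(i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1)
            for i in range(16)]


# The same block type implements different functions purely by configuration.
xor4 = LogicBlock(table_from(lambda a, b, c, d: a ^ b ^ c ^ d))
and4_reg = LogicBlock(table_from(lambda a, b, c, d: a & b & c & d),
                      registered=True)

print(xor4.output(1, 0, 0, 0))   # combinational output of the XOR block
and4_reg.clock(1, 1, 1, 1)       # capture AND(1, 1, 1, 1) in the flip-flop
print(and4_reg.output(0, 0, 0, 0))  # registered value persists after inputs change
```

The point of the model is that the "logic" is nothing more than 16 stored
bits, and that the registered output only changes on a clock event.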

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                              Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
topology, a wire in track number 1 connects only to wires in track number 1
in adjacent channel segments, wires in track number 2 connect only to other
wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                              Switch box topology
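
The planar topology described above can also be expressed as a small Python
sketch. The side names and the function names `planar_targets` and
`switch_count` are assumptions made for illustration; tracks are numbered
from 1 to match the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_targets(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    A wire may only connect to the same track number on the other
    three sides, through one programmable switch each.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track out of range")
    return [(other, track) for other in SIDES if other != side]


def switch_count(channel_width):
    """Programmable switches in one box: one per pair of sides per track."""
    pairs_of_sides = len(SIDES) * (len(SIDES) - 1) // 2
    return pairs_of_sides * channel_width


print(planar_targets("west", 2, channel_width=4))
print(switch_count(4))
```

Because a signal can never change track number inside a switch box, a
router working on this architecture effectively chooses one track (one
"domain") for the whole length of a connection.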

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

                                                                              124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) description or a schematic design. The HDL form might be
easier to work with when handling large structures because it is possible to
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user validates the map,
place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, produces a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device such
as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
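
The RTL simulation step described above can be illustrated with a small self-checking test bench. This is only a sketch under stated assumptions: the entity mod4_counter, its ports and the 20 ns clock period are hypothetical and are not part of the VGA controller design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a modulo-4 counter with an active-low reset.
entity mod4_counter is
    port (
        clk   : in  std_logic;
        rstn  : in  std_logic;
        count : out integer range 0 to 3
    );
end entity mod4_counter;

architecture rtl of mod4_counter is
    signal count_s : integer range 0 to 3 := 0;
begin
    count <= count_s;

    process (clk, rstn)
    begin
        if rstn = '0' then
            count_s <= 0;
        elsif rising_edge(clk) then
            if count_s = 3 then
                count_s <= 0;
            else
                count_s <= count_s + 1;
            end if;
        end if;
    end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- Test bench: it has no ports; it generates the stimulus and checks the response.
entity mod4_counter_tb is
end entity mod4_counter_tb;

architecture sim of mod4_counter_tb is
    constant CLK_PERIOD : time := 20 ns;   -- assumed value, for illustration only
    signal clk   : std_logic := '0';
    signal rstn  : std_logic := '0';
    signal count : integer range 0 to 3;
    signal done  : boolean := false;
begin
    dut : entity work.mod4_counter
        port map (clk => clk, rstn => rstn, count => count);

    -- Free-running clock that stops once the checks are finished.
    clk_gen : process
    begin
        while not done loop
            clk <= '0';
            wait for CLK_PERIOD / 2;
            clk <= '1';
            wait for CLK_PERIOD / 2;
        end loop;
        wait;
    end process clk_gen;

    stimulus : process
    begin
        wait for 2 * CLK_PERIOD;           -- hold reset for two clock cycles
        assert count = 0
            report "counter not cleared by reset" severity error;
        rstn <= '1';
        for i in 1 to 5 loop
            wait until rising_edge(clk);
            wait for 1 ns;                 -- let the new count value settle
            assert count = i mod 4
                report "unexpected count value" severity error;
        end loop;
        done <= true;
        wait;
    end process stimulus;
end architecture sim;
```

The same test bench can, in principle, be reused for the gate-level and back-annotated simulations by binding dut to the corresponding netlist instead of the RTL architecture.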

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware
co-design has become mainstream, and prototyping SoC and ASIC
designs on an FPGA has become a good method to do this.


Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. This is compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows for much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications and hardware, to be found
and fixed earlier.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement to the chip features than would
otherwise have been possible.


                                                                              13 SIMULATION RESULTS

[Figure: gate-level view of the VGA controller]

[Figure: gate-level view of the sync unit]

[Figure: technology schematic]

                                                                              14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.




Representation might include, e.g., state diagrams, timing diagrams and
algorithmic descriptions.

The concept of time may be expressed precisely using delays (e.g. A <= B
after 10 ns).

If no actual delay is used, only the order of sequential operations is defined.

In the lower levels of abstraction (e.g. RTL), synthesis tools ignore detailed
timing specifications.

The actual timing results depend on the implementation technology and the
efficiency of the synthesis tool.

There are a few tools for behavioral synthesis.
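
The difference between an explicit delay and a plain assignment can be sketched in simulation as follows. The entity delay_demo and the signals a, b and c are hypothetical names chosen for illustration; the 10 ns value is taken from the example above. Since synthesis tools ignore the after clause, this sketch is meant for simulation only.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity delay_demo is
end entity delay_demo;

architecture sim of delay_demo is
    signal b : std_logic := '0';
    signal a : std_logic;            -- follows b 10 ns later
    signal c : std_logic;            -- follows b after one delta cycle
begin
    a <= b after 10 ns;              -- explicit delay (simulation only)
    c <= b;                          -- no delay: only the ordering is defined

    stimulus : process
    begin
        wait for 20 ns;
        b <= '1';                    -- b changes at t = 20 ns
        wait for 5 ns;               -- t = 25 ns
        assert c = '1'
            report "c should already follow b" severity error;
        assert a = '0'
            report "a should still hold the old value" severity error;
        wait for 10 ns;              -- t = 35 ns
        assert a = '1'
            report "a should have updated at t = 30 ns" severity error;
        wait;
    end process stimulus;
end architecture sim;
```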

Concurrent vs. Sequential

Processes

The process is the basic simulation concept in VHDL.

A VHDL description can always be broken up into interconnected processes.

This is quite similar to a Unix process.

The process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or wait statement(s) contain the signals which wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
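
As a minimal sketch of the two permitted forms, the following hypothetical entity (process_forms, with illustrative ports a, b, y_list and y_wait) describes the same AND function twice: once with a sensitivity list and once with an explicit wait statement. The two processes are intended to behave identically in simulation.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity process_forms is
    port (
        a, b   : in  std_logic;
        y_list : out std_logic;   -- driven by the sensitivity-list form
        y_wait : out std_logic    -- driven by the wait-statement form
    );
end entity process_forms;

architecture rtl of process_forms is
begin
    -- Form 1: sensitivity list. The process wakes up whenever a or b changes.
    with_list : process (a, b)
    begin
        y_list <= a and b;
    end process with_list;

    -- Form 2: explicit wait statement and no sensitivity list.
    with_wait : process
    begin
        y_wait <= a and b;
        wait on a, b;             -- suspend until a or b changes
    end process with_wait;
end architecture rtl;
```

Adding a wait statement to with_list, or a sensitivity list to with_wait, would be rejected by the compiler, which is the "NOT both" rule stated above.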


                                                                                11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- numeric_std is the IEEE-standard arithmetic package. The Synopsys packages
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are not needed in this file, and
-- mixing them with numeric_std makes the type name "unsigned" ambiguous,
-- so they are left out here.
use IEEE.NUMERIC_STD.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
    port (
        clk50m : in  std_logic;                      -- board clock input
        rstn   : in  std_logic;                      -- active-low reset
        hsync  : out std_logic;
        vsync  : out std_logic;
        red    : out std_logic_vector(0 downto 0);
        green  : out std_logic_vector(0 downto 0);
        blue   : out std_logic_vector(0 downto 0)
    );
end vga_controller;

architecture Behavioral of vga_controller is

    -- Picture memory: 18-bit address, one bit per location.
    component memory
        port (
            addr  : in  std_logic_vector(17 downto 0);
            clk   : in  std_logic;
            dout  : out std_logic_vector(0 downto 0);
            en    : in  std_logic;
            sinit : in  std_logic
        );
    end component;

    -- Sync generator. line_count and c_hs are declared "out" here so that the
    -- component matches the sync_unit entity (the original listing used inout).
    component sync_unit
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : out std_logic;  -- horizontal sync pulse (row)
            vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
            v_en_reg   : out std_logic;
            h_en       : out std_logic;
            v_en       : out std_logic;
            line_count : out integer;
            c_hs       : out integer
        );
    end component;

    -- Pixel generator: produces the memory address and the RGB outputs.
    component pixelgen
        port (
            clk        : in  std_logic;
            rstn       : in  std_logic;
            hsync      : in  std_logic;
            vsync      : in  std_logic;
            h_en       : in  std_logic;
            v_en       : in  std_logic;
            v_en_reg   : in  std_logic;
            data_in    : in  std_logic_vector(0 downto 0);
            line_count : in  integer;
            c_hs       : in  integer;
            red        : out std_logic_vector(0 downto 0);
            green      : out std_logic_vector(0 downto 0);
            blue       : out std_logic_vector(0 downto 0);
            addr       : out std_logic_vector(17 downto 0);
            rd         : out std_logic
        );
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the input clock by two to obtain the internal clock clk.
    process (clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;

    sync : sync_unit port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map (
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    -- NOTE: rstn is active low. Driving sinit with it is only correct if the
    -- memory's sinit input is also active low; check the memory configuration.
    mem12 : memory port map (
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;


                                                                                SYNCHRONIZATION BLOCK CODE

                                                                                library IEEE

                                                                                use IEEESTD_LOGIC_1164ALL

                                                                                use IEEESTD_LOGIC_ARITHALL

                                                                                use IEEESTD_LOGIC_UNSIGNEDALL

                                                                                entity sync_unit is port(

                                                                                clkin std_logic

                                                                                rstnin std_logic

                                                                                hsyncout std_logic ---horizontal synch pulse(row)

                                                                                vsyncout std_logic --vertical sync

                                                                                pulse(frame) 111110100000000000

                                                                                v_en_regout std_logic

                                                                                h_enout std_logic

                                                                                v_enout std_logic

                                                                                line_countout integer

                                                                                c_hsout integer )

                                                                                end sync_unit

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin
  c_hs <= c_hs_s;
  line_count <= line_count_s;
  h_en <= h_en_s;
  v_en <= v_en_s;

  -- hsync counter: counts pixel clocks 0..793, then wraps
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vertical line counter: advances on each rising edge of h_en_s
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      -- the later assignment wins, so the counter wraps after 515
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: active low while the horizontal counter is 0..95
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

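  -- vsync pulse: active low during the first two lines (line_count_s < 2);
  -- the remaining non-visible lines form the vertical porches
  process(clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;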

  -- v_en delayed by one clock; lets the pixel generator detect the rising edge of v_en
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;
end arch_sync_unit;
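
The sync unit above hard-codes its timing thresholds (793, 95, 143, 778, 515, 35 and 2). As a hedged sketch, the same numbers can be collected in a package of named constants so that the intent of each comparison is visible. The package name vga_timing_pkg and every constant name below are hypothetical and not part of the original design; only the values are copied from the listing. For comparison, the commonly used 640x480 at 60 Hz timing has 800 pixel clocks per line and 525 lines per frame, so the totals in this listing (794 and 516) are specific to this design.

```vhdl
package vga_timing_pkg is
  -- Horizontal timing in pixel clocks (values taken from the sync_unit listing).
  constant H_COUNT_MAX    : integer := 793;  -- c_hs_s wraps to 0 after this value
  constant H_SYNC_LAST    : integer := 95;   -- hsync is low while c_hs_s <= 95
  constant H_ACTIVE_FIRST : integer := 143;  -- h_en_s goes high from here...
  constant H_ACTIVE_LAST  : integer := 778;  -- ...up to and including here

  -- Vertical timing in lines (values taken from the sync_unit listing).
  constant V_COUNT_MAX    : integer := 515;  -- line_count_s wraps to 0 after this value
  constant V_SYNC_LINES   : integer := 2;    -- vsync is low while line_count_s < 2
  constant V_ACTIVE_FIRST : integer := 35;   -- v_en_s goes high from here...
  constant V_ACTIVE_LAST  : integer := 515;  -- ...up to and including here

  -- Derived figures, shown only to make the arithmetic explicit.
  constant H_TOTAL  : integer := H_COUNT_MAX + 1;                    -- 794 clocks per line
  constant H_ACTIVE : integer := H_ACTIVE_LAST - H_ACTIVE_FIRST + 1; -- 636 enabled clocks
  constant V_TOTAL  : integer := V_COUNT_MAX + 1;                    -- 516 lines per frame
end vga_timing_pkg;
```

With `use work.vga_timing_pkg.all;` added ahead of the entity, a comparison such as `if (c_hs_s = 793)` could be written as `if (c_hs_s = H_COUNT_MAX)` without changing behaviour.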


                                                                                PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd <= '0';
      red <= "0";
      green <= "0";
      blue <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en (start of a frame): restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the picture: read memory and drive all three colours from it
            rd <= '1';
            red <= data_in;
            green <= data_in;
            blue <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- right of the picture: green
            rd <= '0';
            red <= "0";
            green <= "1";
            blue <= "0";
          end if;
        else
          -- below the picture: white
          rd <= '0';
          red <= "1";
          green <= "1";
          blue <= "1";
        end if;
        -- wrap the address at the end of the frame buffer
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- blanking: all colours off
        red <= "0";
        green <= "0";
        blue <= "0";
      end if;
    end if;
  end process;
end behv;


                                                                                12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
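
As a rough illustration of the logic block described above, the sketch below models one logic element as a 4-input lookup table (LUT) followed by an optional flip-flop. The entity name logic_element, the generics INIT and REGISTERED, and the 4-input width are assumptions chosen for illustration; real devices differ in LUT size and in how the configuration is loaded.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity logic_element is
  generic (
    -- Truth table of the LUT: bit i is the output for input value i.
    -- x"8000" sets only bit 15, which makes the LUT a 4-input AND.
    INIT       : std_logic_vector(15 downto 0) := x"8000";
    -- true: output comes from the flip-flop; false: purely combinational.
    REGISTERED : boolean := false
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- LUT inputs
    y   : out std_logic
  );
end logic_element;

architecture sketch of logic_element is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The four inputs form an index into the 16-entry truth table.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- Memory element of the logic block: a simple D flip-flop.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  y <= ff_q when REGISTERED else lut_out;
end sketch;
```

Changing only the INIT value changes the function: for example, x"6996" turns the same element into a 4-input XOR, which is how a single block can serve as either a simple gate or a more complex combinational function.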

History. The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to use soft processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits them for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
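
To make the CPLD structure concrete, the sketch below shows one sum-of-products term feeding a clocked register, which is the basic shape of a CPLD macrocell. The entity name sop_macrocell, the three inputs, and the particular product terms are hypothetical; in a real CPLD the product terms are programmable rather than fixed as they are here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity sop_macrocell is
  port (
    clk     : in  std_logic;
    rstn    : in  std_logic;  -- active-low asynchronous reset
    a, b, c : in  std_logic;
    q       : out std_logic
  );
end sop_macrocell;

architecture sketch of sop_macrocell is
  signal p0, p1 : std_logic;  -- product terms (AND plane)
  signal sum    : std_logic;  -- sum of products (OR plane)
begin
  -- AND plane: each product term ANDs true or complemented inputs.
  p0 <= a and b;
  p1 <= (not a) and c;

  -- OR plane: the product terms are ORed together.
  sum <= p0 or p1;

  -- The sum of products feeds a single clocked register.
  process (clk, rstn)
  begin
    if rstn = '0' then
      q <= '0';
    elsif rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end sketch;
```

Because every signal passes through the same fixed AND plane, OR plane and register, the delay from input to register is largely the same for any function programmed into it, which is the predictable timing mentioned above.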

                                                                                Another notable difference between CPLDs and FPGAs is the presence in most
                                                                                FPGAs of higher-level embedded functions (such as adders and multipliers)
                                                                                and embedded memories, as well as logic blocks that can implement decoders
                                                                                or mathematical functions.

                                                                                Some FPGAs support partial reconfiguration, which lets one portion of the
                                                                                device be reprogrammed while other portions continue running.

                                                                                12.3 Applications

                                                                                Applications of FPGAs include digital signal processing, software-defined
                                                                                radio, aerospace and defense systems, ASIC prototyping, medical imaging,
                                                                                computer vision, speech recognition, cryptography, bioinformatics, computer
                                                                                hardware emulation, radio astronomy and a growing range of other areas.

                                                                                FPGAs originally began as competitors to CPLDs and competed in a similar
                                                                                space, that of glue logic for PCBs. As their size, capabilities and speed
                                                                                increased, they began to take over larger and larger functions, to the point
                                                                                where some are now marketed as full systems on chips (SoC). Particularly
                                                                                with the introduction of dedicated multipliers into FPGA architectures in
                                                                                the late 1990s, applications which had traditionally been the sole reserve
                                                                                of DSPs began to incorporate FPGAs instead.

                                                                                FPGAs especially find applications in any area or algorithm that can make
                                                                                use of the massive parallelism offered by their architecture. One such area
                                                                                is code breaking, in particular brute-force attacks on cryptographic
                                                                                algorithms.

                                                                                FPGAs are increasingly used in conventional high-performance computing
                                                                                applications, where computational kernels such as FFT or convolution are
                                                                                performed on the FPGA instead of a microprocessor.

                                                                                The inherent parallelism of the logic resources on an FPGA allows for
                                                                                considerable computational throughput even at low MHz clock rates. The
                                                                                flexibility of the FPGA allows for even higher performance by trading off
                                                                                precision and range in the number format for an increased number of parallel
                                                                                arithmetic units. This has driven a new type of processing called
                                                                                reconfigurable computing, where time-intensive tasks are offloaded from
                                                                                software to FPGAs.
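The precision side of that trade-off can be illustrated with a small fixed-point sketch in Python. This is a hedged illustration rather than anything from the report: the helper names `quantize` and `max_error` are hypothetical, and the sketch only shows how rounding error grows as fractional bits are removed, not how many extra arithmetic units a given device would gain.

```python
def quantize(value, frac_bits):
    """Round value to the nearest multiple of 2**-frac_bits (fixed point)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def max_error(frac_bits):
    """Worst-case rounding error for the given number of fractional bits."""
    return 1.0 / (1 << (frac_bits + 1))
```

For example, `quantize(0.3, 8)` keeps more of the value than `quantize(0.3, 4)`, but the narrower format needs a narrower multiplier or adder, which is what frees logic for additional parallel units.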

                                                                                The adoption of FPGAs in high-performance computing is currently limited by
                                                                                the complexity of FPGA design compared to conventional software and by the
                                                                                extremely long turn-around times of current design tools, where a wait of
                                                                                4 to 8 hours is necessary after even minor changes to the source code.

                                                                                Traditionally, FPGAs have been reserved for specific vertical applications
                                                                                where the volume of production is small. For these low-volume applications,
                                                                                the premium that companies pay in hardware costs per unit for a
                                                                                programmable chip is more affordable than the development resources spent
                                                                                on creating an ASIC. Today, new cost and performance dynamics have
                                                                                broadened the range of viable applications.

                                                                                Architecture

                                                                                The most common FPGA architecture consists of an array of configurable
                                                                                logic blocks (CLBs), I/O pads and routing channels. Generally, all the
                                                                                routing channels have the same width (number of wires). Multiple I/O pads
                                                                                may fit into the height of one row or the width of one column in the array.

                                                                                An application circuit must be mapped into an FPGA with adequate resources.
                                                                                While the number of CLBs and I/Os required is easily determined from the
                                                                                design, the number of routing tracks needed may vary considerably, even
                                                                                among designs with the same amount of logic. (For example, a crossbar
                                                                                switch requires much more routing than a systolic array with the same gate
                                                                                count.) Since unused routing tracks increase the cost (and decrease the
                                                                                performance) of the part without providing any benefit, FPGA manufacturers
                                                                                try to provide just enough tracks so that most designs that will fit in
                                                                                terms of LUTs and I/Os can be routed. This is determined by estimates such
                                                                                as those derived from Rent's rule or by experiments with existing designs.
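Rent's rule is commonly written as T = t * G^p, where T is the number of external connections (terminals) of a region of logic, G is the number of blocks in the region, t is the average number of terminals per block and p is the Rent exponent. The Python sketch below simply evaluates that formula; the function name `rent_terminals` and the parameter values used with it are illustrative assumptions, not figures from any FPGA vendor.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals T = t * G**p for a region of G blocks."""
    return terminals_per_block * blocks ** rent_exponent
```

A design style with a higher Rent exponent needs more connections leaving the same amount of logic, which is one way to see why designs with equal gate counts can place very different demands on routing tracks.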

                                                                                The example FPGA considered here is an array, or island-style, FPGA. It
                                                                                consists of an array of logic blocks and routing channels. Two I/O pads fit
                                                                                into the height of one row or the width of one column, as shown below. All
                                                                                the routing channels have the same width (number of wires).

                                                                                12.1 FPGA structure

                                                                                A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
                                                                                flip-flop, as shown below. In recent years, manufacturers have started
                                                                                moving to 6-input LUTs in their high-performance parts, claiming increased
                                                                                performance.

                                                                                Typical logic block

                                                                                There is only one output, which can be either the registered or the
                                                                                unregistered LUT output. The logic block has four inputs for the LUT and a
                                                                                clock input. Since clock signals (and often other high-fanout signals) are
                                                                                normally routed via special-purpose dedicated routing networks in
                                                                                commercial FPGAs, they are managed separately from other signals.
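The logic block just described (a 4-input LUT, a flip-flop and a single output that is either registered or unregistered) can be modeled behaviorally in a few lines of Python. This is a sketch under stated assumptions, not a description of any specific device: the class name `LogicBlock`, the 16-bit integer encoding of the truth table and the choice of input `a` as the least significant index bit are all choices made for this example.

```python
class LogicBlock:
    """Behavioral model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth_table must fit in 16 bits")
        self.truth_table = truth_table  # bit i = LUT output for input code i
        self.registered = registered    # True selects the flip-flop output
        self.ff = 0                     # flip-flop state, reset to 0

    def lut(self, a, b, c, d):
        """Combinational LUT output for the four input bits (a is the LSB)."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


# Truth table for a 4-input AND: only input code 0b1111 (bit 15) yields 1.
AND4 = 1 << 15
```

With `LogicBlock(AND4, registered=True)`, the output stays at 0 until `clock(1, 1, 1, 1)` is called, after which it holds 1 regardless of the current inputs; the unregistered variant follows the inputs immediately.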

                                                                                For this example architecture, the locations of the FPGA logic block pins
                                                                                are shown below.

                                                                                Logic Block Pin Locations

                                                                                Each input is accessible from one side of the logic block, while the output
                                                                                pin can connect to routing wires in both the channel to the right and the
                                                                                channel below the logic block.

                                                                                Each logic block output pin can connect to any of the wiring segments in
                                                                                the channels adjacent to it.

                                                                                Similarly, an I/O pad can connect to any one of the wiring segments in the
                                                                                channel adjacent to it. For example, an I/O pad at the top of the chip can
                                                                                connect to any of the W wires (where W is the channel width) in the
                                                                                horizontal channel immediately below it.

                                                                                Generally, the FPGA routing is unsegmented. That is, each wiring segment
                                                                                spans only one logic block before it terminates in a switch box. By turning
                                                                                on some of the programmable switches within a switch box, longer paths can
                                                                                be constructed. For higher-speed interconnect, some FPGA architectures use
                                                                                longer routing lines that span multiple logic blocks.

                                                                                Whenever a vertical and a horizontal channel intersect, there is a switch
                                                                                box. In this architecture, when a wire enters a switch box, there are three
                                                                                programmable switches that allow it to connect to three other wires in
                                                                                adjacent channel segments. The pattern, or topology, of switches used in
                                                                                this architecture is the planar or domain-based switch box topology. In
                                                                                this switch box topology, a wire in track number 1 connects only to wires
                                                                                in track number 1 in adjacent channel segments, wires in track number 2
                                                                                connect only to other wires in track number 2, and so on. The figure below
                                                                                illustrates the connections in a switch box.

                                                                                Switch box topology
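The planar (domain-based) rule, in which a wire keeps its track number and can reach the three other sides of the switch box, can be written down as a short Python sketch. The side names in `SIDES`, the function name `planar_connections` and the 1-based track numbering are assumptions chosen to match the prose; this is an illustration, not the data model of any routing tool.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_connections(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    A wire keeps its track number and can reach the other three sides.
    Tracks are numbered from 1 to channel_width.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track out of range")
    return [(other, track) for other in SIDES if other != side]
```

For a wire entering from the left on track 1, the function returns the track 1 wires on the top, right and bottom sides, which corresponds to the three programmable switches described in the text.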

                                                                                Modern FPGA families expand upon the above capabilities to include
                                                                                higher-level functionality fixed into the silicon. Having these common
                                                                                functions embedded into the silicon reduces the area required and gives
                                                                                those functions increased speed compared to building them from primitives.
                                                                                Examples of these include multipliers, generic DSP blocks, embedded
                                                                                processors, high-speed I/O logic and embedded memories.

                                                                                FPGAs are also widely used for systems validation, including pre-silicon
                                                                                validation, post-silicon validation and firmware development. This allows
                                                                                chip companies to validate their design before the chip is produced in the
                                                                                factory, reducing the time to market.

                                                                                12.4 FPGA Design and Programming

                                                                                To define the behavior of the FPGA, the user provides a hardware
                                                                                description language (HDL) design or a schematic design. The HDL form might
                                                                                be easier to work with when handling large structures, because it is
                                                                                possible to specify them numerically rather than having to draw every piece
                                                                                by hand. On the other hand, schematic entry can allow for easier
                                                                                visualisation of a design.

                                                                                Then, using an electronic design automation tool, a technology-mapped
                                                                                netlist is generated. The netlist can then be fitted to the actual FPGA
                                                                                architecture using a process called place-and-route, usually performed by
                                                                                the FPGA company's proprietary place-and-route software. The user will
                                                                                validate the map, place and route results via timing analysis, simulation
                                                                                and other verification methodologies. Once the design and validation
                                                                                process is complete, the binary file generated (also using the FPGA
                                                                                company's proprietary software) is used to (re)configure the FPGA.

                                                                                Going from schematic/HDL source files to actual configuration: the source
                                                                                files are fed to a software suite from the FPGA/CPLD vendor that, through
                                                                                several steps, produces a configuration file. This file is then transferred
                                                                                to the FPGA/CPLD via a serial interface (JTAG) or to an external memory
                                                                                device such as an EEPROM.

                                                                                The most common HDLs are VHDL and Verilog. In an attempt to reduce the
                                                                                complexity of designing in HDLs, which have been compared to the equivalent
                                                                                of assembly languages, there are moves to raise the abstraction level
                                                                                through the introduction of alternative languages.

                                                                                To simplify the design of complex systems in FPGAs, there exist libraries
                                                                                of predefined complex functions and circuits that have been tested and
                                                                                optimized to speed up the design process. These predefined circuits are
                                                                                commonly called IP cores and are available from FPGA vendors and
                                                                                third-party IP suppliers (rarely free, and typically released under
                                                                                proprietary licenses). Other predefined circuits are available from
                                                                                developer communities such as OpenCores (typically free, and released under
                                                                                the GPL, BSD or a similar license) and other sources.

                                                                                In a typical design flow, an FPGA application developer will simulate the
                                                                                design at multiple stages throughout the design process. Initially, the RTL
                                                                                description in VHDL or Verilog is simulated by creating test benches to
                                                                                simulate the system and observe results. Then, after the synthesis engine
                                                                                has mapped the design to a netlist, the netlist is translated to a
                                                                                gate-level description, where simulation is repeated to confirm that the
                                                                                synthesis proceeded without errors. Finally, the design is laid out in the
                                                                                FPGA, at which point propagation delays can be added and the simulation run
                                                                                again with these values back-annotated onto the netlist.
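The idea of repeating simulation after synthesis can be sketched outside any HDL tool: apply the same stimulus to a behavioral reference and to a gate-level version, and report any disagreement. The Python below is a toy stand-in for that step, not a real simulator flow; the full-adder example and the names `full_adder_rtl`, `full_adder_gates` and `mismatches` are hypothetical and chosen only for illustration.

```python
from itertools import product


def full_adder_rtl(a, b, cin):
    """Behavioral ('RTL') reference: add three bits, return (sum, carry)."""
    total = a + b + cin
    return total & 1, total >> 1


def full_adder_gates(a, b, cin):
    """Gate-level version built only from XOR, AND and OR operations."""
    partial = a ^ b
    return partial ^ cin, (a & b) | (partial & cin)


def mismatches(reference, candidate, num_inputs):
    """Apply every input combination and list those where outputs differ."""
    return [
        bits
        for bits in product((0, 1), repeat=num_inputs)
        if reference(*bits) != candidate(*bits)
    ]
```

An empty result from `mismatches(full_adder_rtl, full_adder_gates, 3)` plays the role of a passing gate-level simulation. Exhaustive stimulus is only practical for tiny blocks like this one; real test benches use directed or randomized vectors.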

                                                                                Basic Process Technology Types

                                                                                SRAM - Based on static memory technology. In-system programmable and
                                                                                re-programmable. Requires external boot devices. CMOS.

                                                                                Antifuse - One-time programmable. CMOS.

                                                                                EPROM - Erasable Programmable Read-Only Memory technology. Usually
                                                                                one-time programmable in production because of plastic packaging.
                                                                                Windowed devices can be erased with ultraviolet (UV) light. CMOS.

                                                                                EEPROM - Electrically Erasable Programmable Read-Only Memory technology.
                                                                                Can be erased, even in plastic packages. Some, but not all, EEPROM devices
                                                                                can be in-system programmed. CMOS.

                                                                                Flash - Flash-erase EPROM technology. Can be erased, even in plastic
                                                                                packages. Some, but not all, flash devices can be in-system programmed.
                                                                                Usually, a flash cell is smaller than an equivalent EEPROM cell and is
                                                                                therefore less expensive to manufacture. CMOS.

                                                                                Fuse - One-time programmable. Bipolar.

                                                                                Major Manufacturers

                                                                                Xilinx and Altera are the current FPGA market leaders and long-time
                                                                                industry rivals. Together, they control over 80 percent of the market, with
                                                                                Xilinx alone representing over 50 percent.

                                                                                Xilinx also provides free Windows and Linux design software, while Altera
                                                                                provides free Windows tools; its Solaris and Linux tools are only available
                                                                                via a rental scheme.

                                                                                Other competitors include Lattice Semiconductor (flash, SRAM), Actel
                                                                                (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low
                                                                                power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic
                                                                                (handheld-focused CSSP, no general-purpose FPGAs).

                                                                                12.5 FPGA prototype

                                                                                FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
                                                                                prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
                                                                                for hardware verification and early software development.

                                                                                Hardware design verification combined with early software and firmware
                                                                                co-design has become mainstream, and prototyping SoC and ASIC designs on an
                                                                                FPGA has become a good method of doing this.

                                                                                Reasons why Prototyping is important

                                                                                1. Running an SoC design on an FPGA prototype is a reliable way to ensure
                                                                                that it is functionally correct. The alternative is for designers to rely
                                                                                only on software simulations to verify that their hardware design is sound,
                                                                                where simulation speed and modeling accuracy limitations hinder
                                                                                development.

                                                                                2. Due to time constraints, many projects cannot wait until the silicon is
                                                                                back from the foundry to start on software tests. FPGA prototyping allows
                                                                                much more time for software development and testing at the
                                                                                software-hardware integration stage. This allows many unforeseen software
                                                                                bugs, which appear due to today's array of operating systems, applications
                                                                                and hardware, to be found early.

                                                                                3. Prototyping also allows the developer to ensure that all IP technologies
                                                                                on the system work well together, off the simulation stage and in actual
                                                                                form.

                                                                                4. Prototyping has the added advantage of serving as a demo platform for
                                                                                SoC clients, bringing in interest early. This speeds up the overall
                                                                                development cycle and allows for more enhancement or improvement of the
                                                                                chip features than would otherwise have been possible.

                                                                                13 SIMULATION RESULTS

                                                                                GATE LEVEL
                                                                                Fig: VGA CONTROLLER

                                                                                GATE LEVEL
                                                                                Fig: SYNC UNIT

                                                                                TECHNOLOGY SCHEMATIC

                                                                                14 APPLICATIONS

                                                                                Motion Tracking Systems

                                                                                Image processing systems

                                                                                Displaying Systems

                                                                                14.1 CONCLUSION

                                                                                Hence, implemented on an FPGA, the Controller takes a processor-generated
                                                                                (pixelated) picture from memory space and provides digital RGB values for
                                                                                each pixel, as well as horizontal and vertical synchronization signals, in
                                                                                order to correctly display the picture on a connected monitor.

Process keyword in VHDL

A process statement is a concurrent statement.

Statements inside a process statement are sequential statements.

A process must contain either a sensitivity list or wait statement(s), but NOT both.

The sensitivity list or the wait statement(s) names the signals that wake the process up.

General format:

process [(sensitivity_list)]
    process_declarative_part
begin
    process_statements
    [wait_statement]
end process;
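The two permitted forms can be compared side by side. The following is a minimal sketch, not taken from the original design: the entity name process_demo and its ports are hypothetical and chosen only for illustration. Both processes describe the same rising-edge register, one with a sensitivity list and one with a wait statement.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical demo entity (not part of the VGA controller design).
entity process_demo is
  port (
    clk : in  std_logic;
    d   : in  std_logic;
    q1  : out std_logic;
    q2  : out std_logic
  );
end process_demo;

architecture rtl of process_demo is
begin
  -- Form 1: sensitivity list, no wait statement.
  -- The process wakes up on every event on clk.
  with_list : process (clk)
  begin
    if rising_edge(clk) then
      q1 <= d;
    end if;
  end process with_list;

  -- Form 2: wait statement, no sensitivity list.
  -- The process suspends at the wait and resumes on a rising edge of clk.
  with_wait : process
  begin
    wait until rising_edge(clk);
    q2 <= d;
  end process with_wait;
end rtl;
```

The statements inside each process run sequentially, while the two processes themselves run concurrently with each other.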


                                                                                  11 VGA CONTROLLER CODE

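library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED are left out here: they declare
-- their own UNSIGNED/SIGNED types, which become ambiguous next to
-- NUMERIC_STD, and nothing in this entity needs them.
use IEEE.NUMERIC_STD.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;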

entity vga_controller is
  port (
    clk50m : in  std_logic;
    rstn   : in  std_logic;
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0)
  );
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : in  std_logic_vector(17 downto 0);
      clk   : in  std_logic;
      dout  : out std_logic_vector(0 downto 0);
      en    : in  std_logic;
      sinit : in  std_logic
    );
  end component;

  component sync_unit
    port (
      clk        : in    std_logic;
      rstn       : in    std_logic;
      hsync      : out   std_logic;  -- horizontal sync pulse (row)
      vsync      : out   std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out   std_logic;
      h_en       : out   std_logic;
      v_en       : out   std_logic;
      line_count : inout integer;
      c_hs       : inout integer
    );
  end component;

  component pixelgen
    port (
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic
    );
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Toggle clk on every rising edge of clk50m (divide by two).
  process (clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map (
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

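  mem12 : memory port map (
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    -- rstn is active-low; check that this matches the sinit polarity
    -- expected by the memory core.
    sinit => rstn
  );

end Behavioral;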
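The timing produced by this top level can be checked with a little arithmetic. The sketch below assumes that clk50m is a 50 MHz clock, as its name suggests (the text does not state the frequency), and uses the counter limits from the synchronization block listed next: c_hs_s counts 0 to 793 (794 clocks per line) and line_count_s counts 0 to 515 (516 lines per frame).

```latex
\begin{align*}
f_\text{pixel} &= \frac{50\ \text{MHz}}{2} = 25\ \text{MHz} \\
f_\text{line}  &= \frac{25\ \text{MHz}}{794} \approx 31.49\ \text{kHz} \\
f_\text{frame} &= \frac{f_\text{line}}{516} \approx 61.0\ \text{Hz}
\end{align*}
```

Under that assumption, both values are close to the nominal 31.47 kHz line rate and 59.94 Hz frame rate of standard 640x480 VGA timing.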

                                                                                  SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer
  );
end sync_unit;

architecture arch_sync_unit of sync_unit is

  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;

begin

  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: c_hs_s counts clock cycles 0 to 793 within each line
  process (clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s
  -- and wraps from 515 back to 0
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      if (line_count_s = 515) then
        line_count_s <= 0;
      else
        line_count_s <= line_count_s + 1;
      end if;
    end if;
  end process;

  -- hsync pulse
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch
  process (clk, rstn)
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en_reg is v_en_s delayed by one clock cycle
  process (clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
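One way to check the horizontal timing of the synchronization block in simulation is a small testbench. The following is only a sketch: the testbench name sync_unit_tb, the 40 ns clock period (an assumed 25 MHz clock) and the reset release time are assumptions, not part of the original report. It relies on c_hs_s counting 0 to 793, so two successive falling edges of hsync should be 794 clock periods apart.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench for sync_unit (name and timing values are assumed).
entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is
  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz clock

  signal clk        : std_logic := '0';
  signal rstn       : std_logic := '0';
  signal hsync      : std_logic;
  signal vsync      : std_logic;
  signal v_en_reg   : std_logic;
  signal h_en       : std_logic;
  signal v_en       : std_logic;
  signal line_count : integer;
  signal c_hs       : integer;
  signal done       : boolean := false;
begin
  -- Free-running clock that stops once the check has finished.
  clk  <= not clk after CLK_PERIOD / 2 when not done else '0';
  -- Release the active-low reset between two rising clock edges.
  rstn <= '1' after 90 ns;

  dut : entity work.sync_unit
    port map (
      clk        => clk,
      rstn       => rstn,
      hsync      => hsync,
      vsync      => vsync,
      v_en_reg   => v_en_reg,
      h_en       => h_en,
      v_en       => v_en,
      line_count => line_count,
      c_hs       => c_hs
    );

  -- Measure the time between two successive falling edges of hsync.
  check : process
    variable t_first : time;
  begin
    wait until falling_edge(hsync);
    t_first := now;
    wait until falling_edge(hsync);
    assert (now - t_first) = 794 * CLK_PERIOD
      report "unexpected hsync period" severity error;
    report "hsync period check finished";
    done <= true;
    wait;
  end process;
end sim;
```

The same pattern could be extended to vsync, although a full frame of 516 lines takes much longer to simulate.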

                                                                                  PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

                                                                                  architecture behv of pixelgen is

                                                                                  signal addr_sstd_logic_vector(17 downto 0)

                                                                                  begin

                                                                                  addr lt= addr_s

                                                                                  process(clkrstn)

                                                                                  begin

                                                                                  50

                                                                                  if(rstn = 0)then

                                                                                  rd lt= 0

                                                                                  red lt= 0

                                                                                  green lt= 0

                                                                                  blue lt= 0

                                                                                  addr_s lt= x0000 amp 00

                                                                                  elsif (clk = 1 and clkevent)then

                                                                                  if((v_en=1) and (v_en_reg=0))then

                                                                                  addr_slt=(others =gt0)

                                                                                  end if

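if (v_en = '1' and h_en = '1') then
    if (line_count <= 514) then
        if (c_hs <= 655) then
            -- inside the stored-image region: request the next pixel from memory
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
        else
            -- past the horizontal limit: stop reading and output green
            rd    <= '0';
            red   <= '0';
            green <= '1';
            blue  <= '0';
        end if;
    else
        -- past the vertical limit: stop reading and output white
        rd    <= '0';
        red   <= '1';
        green <= '1';
        blue  <= '1';
    end if;
    -- wrap the 18-bit read address once the last location is reached
    if (addr_s = "111011111111111111") then
        addr_s <= (others => '0');
    end if;
else
    -- video not enabled: drive all colour outputs low
    red   <= '0';
    green <= '0';
    blue  <= '0';
end if;
end if;
-- end if;
--end if;
end process;
end behv;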


                                                                                  12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM)
and programmable logic devices (PLDs). PROMs and PLDs both had the option
of being programmed in batches in a factory or in the field (field
programmable); however, programmable logic was hard-wired between logic
gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array, the XC2064, in 1985.
The XC2064 had programmable gates and programmable interconnects between
gates, the beginnings of a new technology and market. The XC2064 boasted a
mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables
(LUTs). More than 20 years later, Freeman was entered into the National
Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant share of its market. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative approach to using hard-macro processors is to make use of
"soft" processor cores that are implemented within the FPGA logic (see "Soft
processors" below).

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at run time, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000 gates, Naval Surface Warfare Department
Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 (estimate): $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing the final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company attributes the growing number of FPGA design starts to
them (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the
advantage of more predictable timing delays and a higher logic-to-interconnect
ratio. FPGA architectures, on the other hand, are dominated by interconnect.
This makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
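The precision-for-parallelism trade-off mentioned above can be illustrated with a small fixed-point sketch. This is only an illustration under stated assumptions: the helper names are hypothetical, and the rule that a multiplier's area grows with the square of its operand width is a rough rule of thumb assumed for the example, not a figure taken from this report or from any vendor data sheet.

```python
def to_fixed(x: float, frac_bits: int) -> int:
    """Quantise x to a fixed-point integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def quantization_error(x: float, frac_bits: int) -> float:
    """Error introduced by storing x with frac_bits fractional bits."""
    return from_fixed(to_fixed(x, frac_bits), frac_bits) - x


def relative_multiplier_count(width_bits: int, baseline_bits: int = 32) -> float:
    """ASSUMPTION: multiplier area scales with width squared, so the same
    area holds (baseline/width)^2 times as many narrower multipliers."""
    return (baseline_bits / width_bits) ** 2


for bits in (4, 8, 12):
    print(bits, "fractional bits -> error", abs(quantization_error(0.1, bits)))
print("16-bit units per 32-bit unit of area:", relative_multiplier_count(16))
```

Fewer fractional bits give a larger rounding error but, under the assumed area rule, leave room for more arithmetic units working in parallel.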

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware cost per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.
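The low-volume argument above amounts to a break-even calculation: an ASIC carries a large one-time development (NRE) cost but a lower unit price, while an FPGA costs more per unit with little up-front cost. The sketch below works this out. All cost figures are hypothetical values chosen only to make the arithmetic visible; they are not taken from this report or from any vendor.

```python
def total_cost(nre: float, unit_cost: float, units: int) -> float:
    """One-time engineering cost plus the per-unit cost of every part built."""
    return nre + unit_cost * units


def break_even_units(asic_nre: float, asic_unit: float,
                     fpga_unit: float, fpga_nre: float = 0.0) -> float:
    """Production volume at which the ASIC and FPGA total costs are equal."""
    premium = fpga_unit - asic_unit  # extra cost paid per FPGA
    if premium <= 0:
        raise ValueError("the FPGA is assumed to cost more per unit than the ASIC")
    return (asic_nre - fpga_nre) / premium


# Hypothetical figures: 1,000,000 ASIC NRE, 5 per ASIC, 25 per FPGA.
volume = break_even_units(asic_nre=1_000_000, asic_unit=5.0, fpga_unit=25.0)
print("break-even volume:", volume)
print("FPGA cost at 10,000 units:", total_cost(0.0, 25.0, 10_000))
print("ASIC cost at 10,000 units:", total_cost(1_000_000, 5.0, 10_000))
```

Below the break-even volume the FPGA is the cheaper choice overall; above it the ASIC's lower unit price wins.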

                                                                                  Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
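Rent's rule is an empirical relation of the form T = t * g^p, where T is the number of external connections of a region, g is the number of logic blocks in it, t is the average number of terminals per block, and p is the Rent exponent. The sketch below only evaluates the rule. The parameter values are hypothetical, and how a manufacturer turns such an estimate into a channel width is not described in this report.

```python
def rent_terminals(num_blocks: int, terminals_per_block: float,
                   rent_exponent: float) -> float:
    """Estimate the external connections of a region with Rent's rule, T = t * g**p."""
    if num_blocks < 1:
        raise ValueError("a region needs at least one logic block")
    return terminals_per_block * num_blocks ** rent_exponent


# Hypothetical parameters: 5 terminals per block (4 LUT inputs + 1 output)
# and a Rent exponent of 0.6.
for blocks in (64, 256, 1024):
    estimate = rent_terminals(blocks, terminals_per_block=5.0, rent_exponent=0.6)
    print(blocks, "blocks ->", round(estimate), "external connections")
```

With these assumed values a region of 1024 blocks needs roughly 320 external connections, far fewer than the 5120 block pins it contains, which is why routing demand grows more slowly than logic count.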

The example FPGA described here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All the
routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                  Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
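The behaviour of such a logic block can be modelled in a few lines of software. The following is a minimal sketch rather than vendor code: the class name LogicBlock, the 16-bit truth-table integer, and the use_register flag are assumptions made for this illustration.

```python
class LogicBlock:
    """Toy model of a classic logic block: one 4-input LUT plus one flip-flop."""

    def __init__(self, truth_table: int, use_register: bool = False):
        # truth_table holds the 16 configuration bits; bit i is the LUT
        # output for the input combination whose binary value is i.
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT has exactly 16 configuration bits")
        self.truth_table = truth_table
        self.use_register = use_register
        self.q = 0  # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output; a is the least significant input."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Clock edge: the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output, registered or unregistered."""
        return self.q if self.use_register else self.lut(a, b, c, d)


def truth_table_for(func) -> int:
    """Build the 16 configuration bits for any Boolean function of four inputs."""
    bits = 0
    for i in range(16):
        a, b, c, d = i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1
        if func(a, b, c, d):
            bits |= 1 << i
    return bits


and4 = LogicBlock(truth_table_for(lambda a, b, c, d: a & b & c & d))
xor4 = LogicBlock(truth_table_for(lambda a, b, c, d: a ^ b ^ c ^ d),
                  use_register=True)
```

The same block implements a 4-input AND or a 4-input XOR purely by changing the 16 stored bits, which is what makes the LUT a general-purpose logic element. With use_register set, the output changes only when clock() is called.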

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                  Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                  Switch box topology
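The planar rule described above can be written down as a small model. The following is a minimal sketch in Python, not part of the original project: the side names, the function name planar_connections and the 1-based track numbering are assumptions made for illustration only.

```python
# Hypothetical model of a planar (domain-based) switch box.
# A wire is identified by the side of the box it enters on and its track number.
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """Return the wires that (side, track) can be switched onto.

    In the planar topology a wire only ever meets wires with the same
    track number, one on each of the three other sides of the box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side!r}")
    if not 1 <= track <= channel_width:
        raise ValueError(f"track must be in 1..{channel_width}")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    # A wire entering from the west on track 2 of a 4-track channel.
    for target in planar_connections("west", 2, 4):
        print(target)
```

Each wire gets exactly three possible connections, one per programmable switch, and a signal never changes track number as it passes through a box.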

Modern FPGA families expand upon the above capabilities to include higher-
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license), and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

                                                                                  Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                  Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-
focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design
has become mainstream, and prototyping SoC and ASIC designs on an FPGA
has become a good method to do this.


                                                                                  Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. This contrasts with designers relying only on software
simulations to verify that their hardware design is sound; simulation speed and
modeling accuracy limitations hinder that approach.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows for much more
time in the area of software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement to the chip features than would
otherwise have been possible.


                                                                                  13 SIMULATION RESULTS

Fig. VGA CONTROLLER (gate level)

Fig. SYNC UNIT (gate level)

Fig. TECHNOLOGY SCHEMATIC

                                                                                  14 APPLICATIONS

                                                                                  Motion Tracking Systems

                                                                                  Image processing systems

                                                                                  Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.


                                                                                  REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


                                                                                    11 VGA CONTROLLER CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.numeric_std.ALL;

---- Uncomment the following library declaration if instantiating
---- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity vga_controller is
  port(
    clk50m : in  std_logic;  -- board clock input
    rstn   : in  std_logic;  -- active-low reset
    hsync  : out std_logic;
    vsync  : out std_logic;
    red    : out std_logic_vector(0 downto 0);
    green  : out std_logic_vector(0 downto 0);
    blue   : out std_logic_vector(0 downto 0));
end vga_controller;

architecture Behavioral of vga_controller is

  component memory
    port (
      addr  : IN  std_logic_VECTOR(17 downto 0);
      clk   : IN  std_logic;
      dout  : OUT std_logic_VECTOR(0 downto 0);
      en    : IN  std_logic;
      sinit : IN  std_logic);
  END component;

  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;  -- horizontal sync pulse (row)
      vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      -- declared "out" here to match the sync_unit entity declaration
      line_count : out integer;
      c_hs       : out integer);
  end component;

  component pixelgen
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : in  std_logic;
      vsync      : in  std_logic;
      h_en       : in  std_logic;
      v_en       : in  std_logic;
      v_en_reg   : in  std_logic;
      data_in    : in  std_logic_vector(0 downto 0);
      line_count : in  integer;
      c_hs       : in  integer;
      red        : out std_logic_vector(0 downto 0);
      green      : out std_logic_vector(0 downto 0);
      blue       : out std_logic_vector(0 downto 0);
      addr       : out std_logic_vector(17 downto 0);
      rd         : out std_logic);
  end component;

  signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
  signal line_count_s, c_hs_s, addr : integer;
  signal data_s : std_logic_vector(0 downto 0);
  signal addr_s : std_logic_vector(17 downto 0);

begin

  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide the input clock by two: clk toggles on every rising edge of clk50m.
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  hsync <= hsync_s;
  vsync <= vsync_s;

  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;
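The clock process and the 18-bit address bus in the listing above imply some simple arithmetic that is easy to check outside the HDL. The sketch below is in Python and is not part of the original design. It assumes that clk50m is a 50 MHz clock (inferred only from the port name) and that the binary constant in the vsync comment is a pixel count; the helper names toggle_divider and address_bits are hypothetical.

```python
# Assumption: the port name clk50m means a 50 MHz input clock.
INPUT_CLOCK_HZ = 50_000_000


def toggle_divider(rising_edges):
    """Model `clk <= not clk` on each rising edge of the input clock.

    Returns the value of clk after each input edge, starting from reset (0).
    """
    clk = 0
    trace = []
    for _ in range(rising_edges):
        clk ^= 1
        trace.append(clk)
    return trace


def address_bits(depth):
    """Smallest address bus width that can select `depth` locations."""
    return max(1, (depth - 1).bit_length())


# One full period of clk takes two input edges, so the frequency is halved.
pixel_clock_hz = INPUT_CLOCK_HZ // 2

# The constant that appears in the vsync comment of the listing.
frame_limit = int("111110100000000000", 2)

if __name__ == "__main__":
    print(toggle_divider(4))
    print(pixel_clock_hz)
    print(frame_limit, address_bits(frame_limit))
    print(address_bits(640 * 480))
```

Under these assumptions the divided clock runs at 25 MHz, and the constant evaluates to 256000, which equals 640 x 400 and fits in the 18-bit addr bus. A full 640 x 480 frame at one bit per pixel would need 19 address bits, so the listing as written appears to address fewer than 480 lines; whether that is intentional cannot be told from the code shown.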


                                                                                    SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is
  port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin
  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 .. 793 within one scan line
  process(clk, rstn)
  begin
    if(rstn = '0')then
      c_hs_s <= 0;
    else
      if(clk = '1' and clk'event)then
        if(c_hs_s = 793)then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s,
  -- i.e. once per scan line, and wraps after line 515
  process(h_en_s, rstn)
  begin
    if(rstn = '0')then
      line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515)then
        line_count_s <= 0;
      end if;
    end if;
  end process;

  -- hsync pulse: active low while the horizontal counter is 0 .. 95
  process(clk, rstn)
  begin
    if(rstn = '0')then
      hsync <= '1';
    elsif(clk = '1' and clk'event)then
      if(c_hs_s <= 95)then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information is sent in this period)
  process(clk, rstn)
  begin
    if(rstn = '0')then
      h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
      if((c_hs_s >= 143) and (c_hs_s <= 778))then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region)
  process(clk, rstn)
  begin
    if(rstn = '0')then
      v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
      if((line_count_s >= 35 and line_count_s <= 515))then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical sync, front porch, back porch: vsync is active low for lines 0 and 1
  process(clk, rstn)
  begin
    if(rstn = '0')then
      vsync <= '1';
    elsif(clk = '1' and clk'event)then
      if(line_count_s < 2)then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
  end process;

  -- v_en delayed by one clock, so that a rising edge of v_en can be detected
  process(clk)
  begin
    if(clk = '1' and clk'event)then
      v_en_reg <= v_en_s;
    end if;
  end process;
end arch_sync_unit;
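The counter limits in the listing above fix the scan timing. As a rough cross-check, the constants can be turned into clocks per line, lines per frame, and an approximate refresh rate. This is a sketch in Python, not part of the VHDL design. The 25 MHz pixel clock is an assumption, since the listing does not state the clock frequency, and the names `span` and `timing_summary` are illustrative.

```python
# Constants copied from the sync_unit listing above.
H_MAX = 793                         # c_hs_s wraps to 0 after this count
HSYNC_LAST = 95                     # hsync is low while c_hs_s <= 95
H_EN_FIRST, H_EN_LAST = 143, 778    # comparison window for h_en_s
V_MAX = 515                         # line_count_s wraps to 0 after this line
VSYNC_LINES = 2                     # vsync is low while line_count_s < 2
V_EN_FIRST, V_EN_LAST = 35, 515     # comparison window for v_en_s


def span(first, last):
    """Number of counter values in the inclusive range first..last."""
    return last - first + 1


def timing_summary(pixel_clock_hz=25_000_000):
    """Derive line and frame timing; pixel_clock_hz is an assumed value."""
    clocks_per_line = H_MAX + 1
    lines_per_frame = V_MAX + 1
    return {
        "clocks_per_line": clocks_per_line,
        "hsync_clocks": span(0, HSYNC_LAST),
        "h_enable_clocks": span(H_EN_FIRST, H_EN_LAST),
        "lines_per_frame": lines_per_frame,
        "vsync_lines": VSYNC_LINES,
        "v_enable_lines": span(V_EN_FIRST, V_EN_LAST),
        "line_rate_hz": pixel_clock_hz / clocks_per_line,
        "frame_rate_hz": pixel_clock_hz / (clocks_per_line * lines_per_frame),
    }


if __name__ == "__main__":
    for name, value in timing_summary().items():
        print(f"{name}: {value}")
```

With the assumed 25 MHz clock this gives 794 clocks per line, 516 lines per frame, and a refresh rate of roughly 61 Hz. The nominal 640x480 VGA timing uses 800 clocks per line and 525 lines per frame, so the listing's limits are slightly tighter than the nominal figures.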


                                                                                    PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if(rstn = '0')then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
      -- rising edge of v_en: restart the frame buffer address
      if((v_en = '1') and (v_en_reg = '0'))then
        addr_s <= (others => '0');
      end if;
      if(v_en = '1' and h_en = '1')then
        if(line_count <= 514)then
          if(c_hs <= 655)then
            -- inside the image: read memory and show the pixel on all three channels
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- right of the image: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- below the image: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address after the last stored pixel
        if(addr_s = "111011111111111111")then
          addr_s <= (others => '0');
        end if;
      else
        -- blanking: drive black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;
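The address wrap value and the line limits in the pixel generator imply a particular frame buffer size. The following Python sketch (illustrative only, not part of the design) works that size out from the constants in the listing. Reading the result as a 512 by 480 image with one bit per pixel is an inference from the numbers, not something the listing states, and the name `framebuffer_geometry` is hypothetical.

```python
# Constants copied from the pixelgen listing above.
ADDR_BITS = 18                             # addr is std_logic_vector(17 downto 0)
ADDR_WRAP = int("111011111111111111", 2)   # addr_s returns to 0 at this value
FIRST_LINE, LAST_LINE = 35, 514            # v_en opens at 35; reads need line_count <= 514


def framebuffer_geometry():
    """Derive the memory depth and a candidate image shape from the constants."""
    depth = ADDR_WRAP + 1                  # addresses 0 .. ADDR_WRAP
    lines = LAST_LINE - FIRST_LINE + 1     # lines on which memory is read
    pixels_per_line, remainder = divmod(depth, lines)
    return {
        "depth": depth,
        "lines": lines,
        "pixels_per_line": pixels_per_line,
        "remainder": remainder,
        "fits_in_address_bus": depth <= 2 ** ADDR_BITS,
    }


if __name__ == "__main__":
    for name, value in framebuffer_geometry().items():
        print(f"{name}: {value}")
```

The depth comes out as 245,760 one-bit locations, which divides evenly into 480 lines of 512 pixels and fits within the 262,144 locations an 18-bit address can reach.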


                                                                                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
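One common way to make a logic block configurable is a lookup table (LUT): a small memory whose address lines are the logic inputs and whose stored bits are the desired outputs. The Python sketch below models that idea under simplifying assumptions (no flip-flop, no routing). The names `make_lut`, `and3` and `xor3` are illustrative, and the model is not taken from any particular FPGA family.

```python
def make_lut(config_bits):
    """Build a k-input lookup table from 2**k configuration bits.

    config_bits[i] is the output for the input combination whose bits,
    read as a binary number with the first input most significant, equal i.
    """
    size = len(config_bits)
    k = size.bit_length() - 1
    if size == 0 or 2 ** k != size:
        raise ValueError("a k-input LUT needs exactly 2**k configuration bits")

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs, got {len(inputs)}")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)   # assemble the table address
        return config_bits[index]

    return lut


# The same 3-input structure, configured in two different ways.
and3 = make_lut([0, 0, 0, 0, 0, 0, 0, 1])   # 1 only when all inputs are 1
xor3 = make_lut([0, 1, 1, 0, 1, 0, 0, 1])   # 1 when an odd number of inputs are 1

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                print(a, b, c, "and3 =", and3(a, b, c), "xor3 =", xor3(a, b, c))
```

Only the eight stored bits differ between the two functions, which is the sense in which one physical block can serve as a simple gate or as a more complex combinational function.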

History: The FPGA industry sprouted from programmable read-only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array, the XC2064, in 1985.
The XC2064 had programmable gates and programmable interconnects between
gates, the beginnings of a new technology and market. The XC2064 boasted a
mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables
(LUTs). More than 20 years later, Freeman was entered into the National
Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at "run time", and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates
1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size
1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
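The precision and range trade-off mentioned above can be illustrated with a small fixed-point quantizer. This is only a sketch: the function name `quantize`, the signed format (one sign bit plus `int_bits` integer bits and `frac_bits` fraction bits), and the sample values are assumptions made for illustration, not part of the design described in this report. A narrower format loses precision and range, which is the price paid for arithmetic units that need fewer logic resources each.

```python
def quantize(value, int_bits, frac_bits):
    """Round value to a signed fixed-point format and saturate on overflow.

    Assumed format: 1 sign bit, int_bits integer bits, frac_bits fraction bits.
    """
    scale = 1 << frac_bits
    max_raw = (1 << (int_bits + frac_bits)) - 1   # largest positive code
    min_raw = -(1 << (int_bits + frac_bits))      # most negative code
    raw = round(value * scale)
    raw = max(min_raw, min(max_raw, raw))         # saturate instead of wrapping
    return raw / scale


if __name__ == "__main__":
    sample = 3.14159
    for frac_bits in (4, 8, 12):
        approx = quantize(sample, 3, frac_bits)
        print(frac_bits, approx, abs(approx - sample))
```

With 4 fraction bits the sample value is represented as 3.125; adding fraction bits shrinks the error, while values outside the representable range (for example 100.0 in the 3.4 format) saturate at the largest code.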

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software, and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
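The low-volume argument above amounts to a break-even calculation between a high one-time cost with a low unit cost (ASIC) and a low one-time cost with a higher unit cost (FPGA). The sketch below assumes a simple linear cost model; the function names and all cost figures are hypothetical and chosen only to show the shape of the trade-off.

```python
def total_cost(nre, unit_cost, volume):
    """Total cost = one-time engineering cost + per-unit cost * volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit_cost, fpga_nre, fpga_unit_cost):
    """Volume above which the ASIC becomes cheaper overall than the FPGA."""
    if fpga_unit_cost <= asic_unit_cost:
        raise ValueError("model assumes the FPGA costs more per unit")
    if asic_nre <= fpga_nre:
        raise ValueError("model assumes the ASIC has the higher NRE cost")
    return (asic_nre - fpga_nre) / (fpga_unit_cost - asic_unit_cost)


# Hypothetical figures, not taken from any vendor or from this report.
ASIC_NRE, ASIC_UNIT = 1_000_000, 5.0
FPGA_NRE, FPGA_UNIT = 50_000, 30.0

if __name__ == "__main__":
    volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
    print(f"break-even volume: {volume:.0f} units")  # 38000 with these figures
```

Below the break-even volume the FPGA is the cheaper option in this model; above it the ASIC wins. Lower FPGA unit prices or higher ASIC development costs move the break-even point upward, which is one way to read the claim that the range of viable applications has broadened.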

                                                                                    Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
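Rent's rule is usually stated as T = t * g^p, where T is the number of external terminals of a region of logic, g is the number of blocks in that region, t is the average number of terminals per block, and p is the Rent exponent. The sketch below only evaluates that formula; the function name and the parameter values are assumptions for illustration, and it does not reproduce how any manufacturer actually derives a channel width from it.

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals of a logic region with Rent's rule.

    T = t * g**p, with g = num_blocks, t = terminals_per_block,
    p = rent_exponent (between 0 and 1).
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    if not 0.0 <= rent_exponent <= 1.0:
        raise ValueError("rent_exponent is expected to lie in [0, 1]")
    return terminals_per_block * num_blocks ** rent_exponent


if __name__ == "__main__":
    # Same amount of logic, two hypothetical Rent exponents.
    for p in (0.5, 0.75):
        print(p, rent_terminals(1024, 4, p))
```

For the same number of blocks, a larger exponent predicts many more connections crossing the region boundary, which is consistent with the observation above that routing demand can differ widely between designs with the same amount of logic.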

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                                    Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they are managed separately from other signals.
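The logic block described above can be modeled behaviorally in a few lines. This is a sketch under stated assumptions: the class name `LogicBlock`, the input ordering, and the parity example are hypothetical, and the clock is reduced to an explicit method call rather than a routed signal.

```python
class LogicBlock:
    """Behavioral model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table, registered):
        if len(truth_table) != 16 or any(bit not in (0, 1) for bit in truth_table):
            raise ValueError("truth_table must contain exactly 16 bits")
        self.truth_table = list(truth_table)
        self.registered = registered   # selects which signal drives the output
        self.ff_state = 0              # flip-flop contents, cleared at start

    def lut(self, a, b, c, d):
        """Combinational lookup: the four inputs form the table index."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model one active clock edge: the flip-flop captures the LUT output."""
        self.ff_state = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered or unregistered LUT value."""
        return self.ff_state if self.registered else self.lut(a, b, c, d)


# Example configuration: 4-input parity (XOR of all inputs).
PARITY = [bin(i).count("1") & 1 for i in range(16)]

if __name__ == "__main__":
    block = LogicBlock(PARITY, registered=True)
    block.clock(1, 0, 0, 0)
    print(block.output(0, 0, 0, 0))  # holds the value captured at the clock edge
```

Changing the 16 truth-table bits changes the function the block implements, which is what configuring a LUT amounts to in this simplified model.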

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                    Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                    Switch box topology
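The planar (domain-based) rule described above can be written down directly: a wire entering on one side may be switched to the wire with the same track number on each of the other three sides. The sketch below is illustrative; the side names and the function name are assumptions, and tracks are numbered from 1 to match the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    Each wire has three programmable switches, one to the same-numbered
    track on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track must be in 1..channel_width")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_switch_targets("west", 2, 4))
```

Because the track number never changes inside a switch box, a signal that starts on track 2 stays on track 2 for its whole route in this topology, which is why the tracks are also called domains.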

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware description language (HDL) or as a schematic. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, produces a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
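The idea of re-running the same test bench against a behavioral description and a gate-level description can be sketched in software. The example below is only an analogy in Python, not an HDL test bench: the one-bit full adder, the function names, and the exhaustive stimulus are assumptions chosen for illustration, and no timing or propagation delay is modeled.

```python
from itertools import product


def adder_behavioral(a, b, cin):
    """Reference model: describes what the circuit should compute."""
    total = a + b + cin
    return total & 1, total >> 1          # (sum, carry out)


def adder_gate_level(a, b, cin):
    """Gate-level model: the same function expressed as XOR/AND/OR gates."""
    a_xor_b = a ^ b
    sum_bit = a_xor_b ^ cin
    carry_out = (a & b) | (a_xor_b & cin)
    return sum_bit, carry_out


def run_test_bench(reference, candidate):
    """Apply every input combination and collect the mismatching stimuli."""
    mismatches = []
    for a, b, cin in product((0, 1), repeat=3):
        if reference(a, b, cin) != candidate(a, b, cin):
            mismatches.append((a, b, cin))
    return mismatches


if __name__ == "__main__":
    failures = run_test_bench(adder_behavioral, adder_gate_level)
    print("mismatches:", failures)
```

An empty mismatch list plays the role of the gate-level simulation confirming that synthesis preserved the behavior; a real flow would run the HDL test bench in a simulator and, after layout, repeat it with back-annotated delays.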

                                                                                    Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                    Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on FPGAs has become a good method to do this.


                                                                                    Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be caught.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                    13 SIMULATION RESULTS

Fig. Gate-level schematic of the VGA controller

Fig. Gate-level schematic of the sync unit

Fig. Technology schematic

                                                                                    14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

    component sync_unit port(
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : out std_logic;  -- horizontal sync pulse (row)
        vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
        v_en_reg   : out std_logic;
        h_en       : out std_logic;
        v_en       : out std_logic;
        line_count : out integer;    -- same mode (out) as in the sync_unit entity
        c_hs       : out integer);   -- same mode (out) as in the sync_unit entity
    end component;

    component pixelgen port(
        clk        : in  std_logic;
        rstn       : in  std_logic;
        hsync      : in  std_logic;
        vsync      : in  std_logic;
        h_en       : in  std_logic;
        v_en       : in  std_logic;
        v_en_reg   : in  std_logic;
        data_in    : in  std_logic_vector(0 downto 0);
        line_count : in  integer;
        c_hs       : in  integer;
        red        : out std_logic_vector(0 downto 0);
        green      : out std_logic_vector(0 downto 0);
        blue       : out std_logic_vector(0 downto 0);
        addr       : out std_logic_vector(17 downto 0);
        rd         : out std_logic);
    end component;

    signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
    signal line_count_s, c_hs_s, addr : integer;
    signal data_s : std_logic_vector(0 downto 0);
    signal addr_s : std_logic_vector(17 downto 0);

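begin

    --addr_s <= std_logic_vector(to_unsigned(addr, 18));

    -- Divide the clk50m input by two to obtain the clock used by the
    -- sync unit, the pixel generator and the memory.
    process(clk50m, rstn)
    begin
        if (rstn = '0') then
            clk <= '0';
        elsif (clk50m = '1' and clk50m'event) then
            clk <= not clk;
        end if;
    end process;

    hsync <= hsync_s;
    vsync <= vsync_s;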

    sync : sync_unit port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        v_en_reg   => v_en_reg_s,
        line_count => line_count_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        c_hs       => c_hs_s
    );

    pixel : pixelgen port map(
        clk        => clk,
        rstn       => rstn,
        hsync      => hsync_s,
        vsync      => vsync_s,
        h_en       => h_en_s,
        v_en       => v_en_s,
        v_en_reg   => v_en_reg_s,
        data_in    => data_s,
        line_count => line_count_s,
        c_hs       => c_hs_s,
        red        => red,
        green      => green,
        blue       => blue,
        addr       => addr_s,
        rd         => rd_s
    );

    mem12 : memory port map(
        addr  => addr_s,
        clk   => clk,
        dout  => data_s,
        en    => rd_s,
        sinit => rstn
    );

end Behavioral;

                                                                                      SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : out std_logic;  -- horizontal sync pulse (row)
    vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
    v_en_reg   : out std_logic;
    h_en       : out std_logic;
    v_en       : out std_logic;
    line_count : out integer;
    c_hs       : out integer);
end sync_unit;

architecture arch_sync_unit of sync_unit is

    signal h_en_s, v_en_s : std_logic;
    signal line_count_s, c_hs_s : integer;

begin

    c_hs       <= c_hs_s;
    line_count <= line_count_s;
    h_en       <= h_en_s;
    v_en       <= v_en_s;

    -- hsync counter: counts clock cycles within a line, 0 to 793
    process(clk, rstn)
    begin
        if (rstn = '0') then
            c_hs_s <= 0;
        else
            if (clk = '1' and clk'event) then
                if (c_hs_s = 793) then
                    c_hs_s <= 0;
                else
                    c_hs_s <= c_hs_s + 1;
                end if;
            end if;
        end if;
    end process;

    -- vsync counter: vertical line counter, 0 to 515,
    -- advanced on each rising edge of h_en_s
    process(h_en_s, rstn)
    begin
        if (rstn = '0') then
            line_count_s <= 0;
        elsif (h_en_s = '1' and h_en_s'event) then
            line_count_s <= line_count_s + 1;
            -- the later assignment wins, so the counter wraps after 515
            if (line_count_s = 515) then
                line_count_s <= 0;
            end if;
        end if;
    end process;

    -- hsync pulse
    process(clk, rstn)
    begin
        if (rstn = '0') then
            hsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (c_hs_s <= 95) then
                hsync <= '0';
            else
                hsync <= '1';
            end if;
        end if;
    end process;

    -- Horizontal enable (row information is sent in this period)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            h_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
                h_en_s <= '1';
            else
                h_en_s <= '0';
            end if;
        end if;
    end process;

    -- Vertical enable (data is valid in this region)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            v_en_s <= '0';
        elsif (clk = '1' and clk'event) then
            if ((line_count_s >= 35 and line_count_s <= 515)) then
                v_en_s <= '1';
            else
                v_en_s <= '0';
            end if;
        end if;
    end process;

    -- vsync pulse (vertical sync, front porch, back porch)
    process(clk, rstn)
    begin
        if (rstn = '0') then
            vsync <= '1';
        elsif (clk = '1' and clk'event) then
            if (line_count_s < 2) then
                vsync <= '0';
            else
                vsync <= '1';
            end if;
        end if;
    end process;

    -- v_en delayed by one clock; pixelgen compares v_en with v_en_reg
    process(clk)
    begin
        if (clk = '1' and clk'event) then
            v_en_reg <= v_en_s;
        end if;
    end process;

end arch_sync_unit;
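
The horizontal timing of the sync unit can be checked in simulation. The following is a minimal self-checking testbench sketch, not part of the original design. It assumes a 40 ns clock period (25 MHz, which is what the top-level divider would produce if clk50m is a 50 MHz input), and the names sync_unit_tb, CLK_PERIOD, H_COUNT_MAX, H_SYNC_LAST, H_TOTAL, H_SYNC_WIDTH and done are hypothetical. The two numeric limits are copied from the comparisons in the sync unit. Only hsync is checked here, because the vertical counter is clocked by h_en and its first period after reset is shorter than the steady-state one.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench; entity and constant names are illustrative only.
entity sync_unit_tb is
end sync_unit_tb;

architecture sim of sync_unit_tb is
    -- Assumed clock period: 40 ns (25 MHz).
    constant CLK_PERIOD   : time    := 40 ns;
    -- Limits copied from the comparisons inside sync_unit.
    constant H_COUNT_MAX  : integer := 793;  -- c_hs_s wraps to 0 after this value
    constant H_SYNC_LAST  : integer := 95;   -- hsync is low while c_hs_s <= 95
    -- Derived figures, in clock cycles.
    constant H_TOTAL      : integer := H_COUNT_MAX + 1;  -- clocks per line
    constant H_SYNC_WIDTH : integer := H_SYNC_LAST + 1;  -- clocks with hsync low

    signal clk  : std_logic := '0';
    signal rstn : std_logic := '0';
    signal done : boolean   := false;
    signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
    signal line_count, c_hs : integer;
begin
    -- Free-running clock that stops once the checks have finished.
    clk <= not clk after CLK_PERIOD / 2 when not done else '0';

    dut : entity work.sync_unit
        port map (
            clk        => clk,
            rstn       => rstn,
            hsync      => hsync,
            vsync      => vsync,
            v_en_reg   => v_en_reg,
            h_en       => h_en,
            v_en       => v_en,
            line_count => line_count,
            c_hs       => c_hs);

    check : process
        variable t_fall : time;
    begin
        -- Hold the active-low reset for a few clocks, then release it.
        wait for 5 * CLK_PERIOD;
        rstn <= '1';

        -- Start of the first sync pulse after reset.
        wait until falling_edge(hsync);
        t_fall := now;

        -- The pulse should stay low for H_SYNC_WIDTH clocks.
        wait until rising_edge(hsync);
        assert (now - t_fall) = H_SYNC_WIDTH * CLK_PERIOD
            report "hsync low time differs from H_SYNC_WIDTH clocks"
            severity error;

        -- The next pulse should start H_TOTAL clocks after the first one.
        wait until falling_edge(hsync);
        assert (now - t_fall) = H_TOTAL * CLK_PERIOD
            report "hsync period differs from H_TOTAL clocks"
            severity error;

        report "sync_unit hsync checks finished" severity note;
        done <= true;
        wait;
    end process;
end sim;
```

Keeping the limits as named constants in one place makes it easier to compare them against a published 640x480 timing table before changing the design.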

                                                                                      PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is

    signal addr_s : std_logic_vector(17 downto 0);

begin

    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en (start of the visible frame): restart the address
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Read the next 1-bit pixel and drive it on all three channels
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address after the last location (245759)
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;

                                                                                      end behv


                                                                                      12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates such as AND and XOR. In most FPGAs the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternative to hard-macro processors is to use "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to reprogram in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing the final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.
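The CPLD structure described above, a sum-of-products array feeding a clocked register, can be sketched in VHDL. This is only an illustration: the entity name sop_macrocell, its ports, and the two product terms are hypothetical and are fixed here, whereas in a real CPLD the product terms are programmable.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical CPLD-style macrocell (names and terms are assumptions):
-- two AND product terms, one OR gate, one clocked register.
entity sop_macrocell is
  port (
    clk     : in  std_logic;
    a, b, c : in  std_logic;
    q       : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal p0, p1 : std_logic;  -- product terms
  signal sum    : std_logic;  -- sum of products
begin
  p0  <= a and b;             -- product term 0
  p1  <= (not a) and c;       -- product term 1
  sum <= p0 or p1;            -- OR plane

  -- The register that the sum-of-products array feeds.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end architecture rtl;
```

Because every signal passes through the same AND plane, OR gate, and register, the delay from input to register is the same whatever function is chosen, which is the predictable timing mentioned above.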

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial reconfiguration, which lets one portion of the device be reprogrammed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications that had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low clock rates (in MHz). The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
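The trade-off between number format and parallelism described above can be sketched in VHDL. The entity below is a hypothetical example, not part of the VGA design: parallel_adders, LANES, and WIDTH are names chosen here for illustration. Reducing WIDTH makes each adder smaller, so more lanes could fit in roughly the same amount of logic.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical array of independent adders that all operate in the same clock cycle.
entity parallel_adders is
  generic (
    LANES : positive := 4;   -- number of parallel arithmetic units (assumed value)
    WIDTH : positive := 8    -- bits per operand (assumed value)
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    b   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    s   : out std_logic_vector(LANES*WIDTH - 1 downto 0)
  );
end entity parallel_adders;

architecture rtl of parallel_adders is
begin
  -- One registered adder per lane; each lane owns its own slice of a, b and s.
  gen_lane : for i in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        -- Addition wraps modulo 2**WIDTH (no carry out is kept).
        s((i+1)*WIDTH - 1 downto i*WIDTH) <=
          std_logic_vector(unsigned(a((i+1)*WIDTH - 1 downto i*WIDTH)) +
                           unsigned(b((i+1)*WIDTH - 1 downto i*WIDTH)));
      end if;
    end process;
  end generate gen_lane;
end architecture rtl;
```

With the default generics this performs four 8-bit additions per clock cycle. Instantiating it with LANES => 8 and WIDTH => 4 keeps the same total port width but performs eight narrower additions instead.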

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware cost per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                                                      Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
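For reference, Rent's rule in its usual form relates the number of external connections of a block of logic to the amount of logic inside it. Here T is the number of terminals, g the number of gates (or logic blocks), t the average number of terminals per gate, and p the empirical Rent exponent, typically below 1. The numbers in the worked line are illustrative assumptions, not figures from any particular device.

```latex
T = t \cdot g^{p}

% Illustrative values only: t = 4, g = 64, p = 0.5
T = 4 \cdot 64^{0.5} = 4 \cdot 8 = 32
```

A design with a larger exponent p needs more connections for the same amount of logic, which is why two designs with the same gate count can require very different numbers of routing tracks.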

The FPGA is an array or island-style FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                                      Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
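The logic block described above can be modelled behaviourally in VHDL. The following is a minimal sketch under stated assumptions, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and USE_FF, and the port names are all hypothetical. The 16-bit generic plays the role of the LUT contents, and the boolean generic selects the registered or unregistered output.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: 4-input LUT, flip-flop, output select.
entity logic_block is
  generic (
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";  -- truth table; x"8000" is a 4-input AND
    USE_FF   : boolean := false                            -- true selects the registered output
  );
  port (
    clk     : in  std_logic;
    lut_in  : in  std_logic_vector(3 downto 0);
    blk_out : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_q    : std_logic := '0';
begin
  -- The LUT is a 16-entry table indexed by the four inputs.
  lut_out <= LUT_INIT(to_integer(unsigned(lut_in)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_q <= lut_out;
    end if;
  end process;

  -- Single output: registered or unregistered LUT value, fixed by configuration.
  blk_out <= ff_q when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the function without changing the structure. For example, x"FFFE" gives a 4-input OR and x"6996" gives a 4-input XOR, which is how a single block type can implement any function of four inputs.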

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                      Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                      Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design written in a hardware description language (HDL) or drawn as a schematic. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.
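
As an illustration of specifying a structure numerically, the following minimal sketch describes a counter whose width is set by a single generic. The entity name counter_n, the generic WIDTH and the port names are assumptions made for this example and are not part of the VGA controller sources.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical example entity (not part of the VGA controller sources).
entity counter_n is
  generic (
    WIDTH : positive := 10  -- number of counter bits, chosen numerically
  );
  port (
    clk   : in  std_logic;
    rstn  : in  std_logic;  -- active-low asynchronous reset
    count : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity counter_n;

architecture rtl of counter_n is
  signal count_r : unsigned(WIDTH - 1 downto 0);
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      count_r <= (others => '0');
    elsif rising_edge(clk) then
      count_r <= count_r + 1;  -- wraps to zero after 2**WIDTH - 1
    end if;
  end process;

  count <= std_logic_vector(count_r);
end architecture rtl;
```

Changing WIDTH from 10 to 32 resizes the whole counter, whereas a schematic would need every additional flip-flop and adder stage drawn by hand.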

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: The source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
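
The first of these stages, RTL simulation with a test bench, can be sketched as follows. This is a minimal illustration under stated assumptions, not the test bench used for this project: the entity wrap_counter, its generic MAX, the test bench name wrap_counter_tb and the 40 ns clock period (a 25 MHz clock) are all hypothetical. The wrap value 793 mirrors the horizontal counter in the synchronization block code listed later in this document.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical device under test: a free-running counter that wraps at MAX.
entity wrap_counter is
  generic (MAX : natural := 793);
  port (
    clk  : in  std_logic;
    rstn : in  std_logic;  -- active-low asynchronous reset
    q    : out natural range 0 to MAX
  );
end entity wrap_counter;

architecture rtl of wrap_counter is
  signal q_r : natural range 0 to MAX := 0;
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      q_r <= 0;
    elsif rising_edge(clk) then
      if q_r = MAX then
        q_r <= 0;
      else
        q_r <= q_r + 1;
      end if;
    end if;
  end process;
  q <= q_r;
end architecture rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Test bench: has no ports, drives the DUT and checks its outputs.
entity wrap_counter_tb is
end entity wrap_counter_tb;

architecture sim of wrap_counter_tb is
  constant MAX        : natural := 793;
  constant CLK_PERIOD : time := 40 ns;  -- assumed 25 MHz clock
  signal clk  : std_logic := '0';
  signal rstn : std_logic := '0';
  signal q    : natural range 0 to MAX;
  signal done : boolean := false;
begin
  dut : entity work.wrap_counter
    generic map (MAX => MAX)
    port map (clk => clk, rstn => rstn, q => q);

  -- Clock generator; stops toggling once the stimulus process sets done.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  stimulus : process
  begin
    -- Hold reset for two clock periods, then release it.
    wait for 2 * CLK_PERIOD;
    rstn <= '1';
    -- After MAX rising edges the counter should read MAX.
    for i in 1 to MAX loop
      wait until rising_edge(clk);
    end loop;
    wait for 1 ns;  -- let the registered value settle
    assert q = MAX report "counter did not reach MAX" severity error;
    -- One more rising edge should wrap the counter back to zero.
    wait until rising_edge(clk);
    wait for 1 ns;
    assert q = 0 report "counter did not wrap to 0" severity error;
    done <= true;
    wait;
  end process;
end architecture sim;
```

The same test bench can in principle be reused at the later stages by binding dut to the gate-level or back-annotated netlist instead of the RTL architecture, which is what makes the repeated simulations comparable.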

                                                                                      Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.
Antifuse - One-time programmable. CMOS.
EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.
EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.
Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.
Fuse - One-time programmable. Bipolar.

                                                                                      Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.


                                                                                      Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.
2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.
3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, off the simulation stage and in actual form.
4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.


                                                                                      13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                      14 APPLICATIONS

                                                                                      Motion Tracking Systems

                                                                                      Image processing systems

                                                                                      Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.





signal clk, hsync_s, vsync_s, h_en_s, v_en_s, v_en_reg_s, rd_s : std_logic;
signal line_count_s, c_hs_s, addr : integer;
signal data_s : std_logic_vector(0 downto 0);
signal addr_s : std_logic_vector(17 downto 0);
begin
  --addr_s <= std_logic_vector(to_unsigned(addr, 18));

  -- Divide clk50m by two to obtain the clock used by the sub-blocks
  process(clk50m, rstn)
  begin
    if (rstn = '0') then
      clk <= '0';
    elsif (clk50m = '1' and clk50m'event) then
      clk <= not clk;
    end if;
    --end if;
  end process;

  -- Drive the top-level sync outputs from the internal sync signals
  hsync <= hsync_s;
  vsync <= vsync_s;

  -- Synchronization unit: sync pulses, enables and counters
  sync : sync_unit port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    v_en_reg   => v_en_reg_s,
    line_count => line_count_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    c_hs       => c_hs_s
  );

  -- Pixel generator: produces RGB outputs and the memory read address
  pixel : pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Picture memory: read by the pixel generator through addr_s and rd_s
  mem12 : memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );
end Behavioral;


                                                                                        SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in std_logic;
  rstn       : in std_logic;
  hsync      : out std_logic; ---horizontal synch pulse(row)
  vsync      : out std_logic; --vertical sync pulse(frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
signal h_en_s, v_en_s : std_logic;
signal line_count_s, c_hs_s : integer;
begin
  c_hs <= c_hs_s;
  line_count <= line_count_s;
  h_en <= h_en_s;
  v_en <= v_en_s;

  -- hsync counter: counts clock cycles within a line, 0 to 793
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  --vsync counter
  --vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      -- the later assignment overrides the increment when the count is 515
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

-- hsync pulse
process(clk, rstn)
begin
    if (rstn = '0') then
        hsync <= '1';
    elsif (clk = '1' and clk'event) then
        -- active-low pulse while the counter is in 0 to 95
        if (c_hs_s <= 95) then
            hsync <= '0';
        else
            hsync <= '1';
        end if;
    end if;
end process;

-- Horizontal enable (row information will be sent in this period)
process(clk, rstn)
begin
    if (rstn = '0') then
        h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
            h_en_s <= '1';
        else
            h_en_s <= '0';
        end if;
    end if;
end process;

-- vsync pulse
-- Vertical enable (data is valid in this region)
process(clk, rstn)
begin
    if (rstn = '0') then
        v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
        if (line_count_s >= 35 and line_count_s <= 515) then
            v_en_s <= '1';
        else
            v_en_s <= '0';
        end if;
    end if;
end process;

-- Vertical sync, front porch, back porch
process(clk, rstn)
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        -- active-low pulse during the first two lines of the frame
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
end process;

-- one-clock delayed copy of the vertical enable
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;
end arch_sync_unit;
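The literal values in the sync unit above are easier to follow when they are given names. The package below is only a sketch of how that could look: the package name and constant names are assumptions made for illustration and do not appear in the original design, while the values are copied from the listing.

```vhdl
-- Hypothetical package; names are illustrative, values come from the
-- sync unit listing.
package sync_timing_pkg is
    -- Horizontal timing, in pixel clocks
    constant H_COUNT_MAX    : integer := 793;  -- c_hs_s counts 0 to 793 (794 clocks per line)
    constant H_SYNC_LAST    : integer := 95;   -- hsync is low while c_hs_s <= 95
    constant H_ACTIVE_FIRST : integer := 143;  -- h_en_s goes high from this count
    constant H_ACTIVE_LAST  : integer := 778;  -- last count with h_en_s high

    -- Vertical timing, in lines
    constant V_COUNT_MAX    : integer := 515;  -- line_count_s counts 0 to 515 (516 lines)
    constant V_SYNC_LINES   : integer := 2;    -- vsync is low while line_count_s < 2
    constant V_ACTIVE_FIRST : integer := 35;   -- v_en_s goes high from this line
    constant V_ACTIVE_LAST  : integer := 515;  -- last line with v_en_s high

    -- Derived widths of the enable windows
    constant H_ACTIVE_CLOCKS : integer := H_ACTIVE_LAST - H_ACTIVE_FIRST + 1;  -- 636
    constant V_ACTIVE_LINES  : integer := V_ACTIVE_LAST - V_ACTIVE_FIRST + 1;  -- 481
end package sync_timing_pkg;
```

With such a package in scope, a comparison like c_hs_s = 793 could be written as c_hs_s = H_COUNT_MAX, which keeps the counter limits and the enable windows in one place.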


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- rising edge of v_en: restart the read address for a new frame
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- inside the picture: read memory and drive all three colours
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- right of the picture: green
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- below the picture: white
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- wrap the address at the end of the frame buffer
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- blanking interval: black
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
    end process;
end behv;


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
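The idea of a configurable logic block can be sketched in VHDL as a small lookup table followed by an optional flip-flop. This is a simplified illustration rather than the structure of any particular device: the entity name logic_block and the generics INIT and REGISTERED are assumptions made for this example, and the sketch uses ieee.numeric_std for the index conversion.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 2-input logic block: a 4-entry lookup table (LUT)
-- followed by an optional flip-flop. Names are illustrative only.
entity logic_block is
    generic (
        INIT       : std_logic_vector(3 downto 0) := "0110"; -- truth table; "0110" gives XOR
        REGISTERED : boolean := false                        -- true: output taken from the flip-flop
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic;
        b   : in  std_logic;
        q   : out std_logic
    );
end logic_block;

architecture sketch of logic_block is
    signal sel     : unsigned(1 downto 0);
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The two inputs form an index into the truth table.
    sel     <= b & a;
    lut_out <= INIT(to_integer(sel));

    -- Memory element: a D flip-flop that samples the LUT output.
    process(clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Select the combinational or the registered output.
    q <= ff_out when REGISTERED else lut_out;
end sketch;
```

Changing only the INIT generic changes the function: "1000" turns the same block into an AND gate, and "0110" into an XOR gate. In this sketch, that is the sense in which a logic block is configured rather than rewired.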

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was inducted into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 (Naval Surface Warfare Department)
Early 2000s: millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, which the company credits for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. The FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
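As a rough illustration of the sum-of-products structure described above, the following Python sketch evaluates an OR of AND terms, which is the kind of function a CPLD logic array realizes. The names sum_of_products, truth_table and XOR_TERMS are hypothetical and chosen only for this example; they do not come from any vendor tool.

```python
from itertools import product


def sum_of_products(terms, inputs):
    """Evaluate an OR of AND terms.

    terms:  list of product terms; each term is a dict mapping an input
            index to the value that input must have (True = input used
            directly, False = input used inverted).
    inputs: tuple of booleans, one per input pin.
    """
    return any(
        all(inputs[index] == wanted for index, wanted in term.items())
        for term in terms
    )


# Hypothetical example: two-input XOR written as
# (a AND NOT b) OR (NOT a AND b).
XOR_TERMS = [{0: True, 1: False}, {0: False, 1: True}]


def truth_table(terms, n_inputs):
    """Return the full truth table of a sum-of-products function."""
    return {
        bits: sum_of_products(terms, bits)
        for bits in product((False, True), repeat=n_inputs)
    }
```

Calling truth_table(XOR_TERMS, 2) lists the output for all four input combinations. In a real CPLD the output of such an array would then feed one of the clocked registers.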

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.
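To make the parallelism argument concrete, here is a minimal Python sketch of how a brute-force key search could be divided among independent search engines. It is only a software stand-in: on an FPGA each range would be handled by its own hardware engine running concurrently. The function names and the is_correct_key callback are hypothetical.

```python
def split_keyspace(key_bits, workers):
    """Split all 2**key_bits keys into contiguous (start, stop) ranges.

    Ranges are half-open and differ in size by at most one key.
    """
    total = 1 << key_bits
    base, extra = divmod(total, workers)
    ranges = []
    start = 0
    for worker in range(workers):
        size = base + (1 if worker < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def search(ranges, is_correct_key):
    """Try every key in every range and return the first match, or None.

    The loop over ranges is sequential here; in hardware each range
    would be searched by a separate engine at the same time.
    """
    for start, stop in ranges:
        for key in range(start, stop):
            if is_correct_key(key):
                return key
    return None
```

With N engines the expected search time drops by roughly a factor of N, since the ranges are independent and need no communication until a match is found.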

FPGAs are increasingly used in conventional high performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
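The trade-off between number format and parallelism can be sketched with simple arithmetic. The Python example below assumes, purely for illustration, that one multiplier costs about word_bits squared logic cells and that every unit completes one operation per clock cycle; neither assumption comes from this report, and real devices differ.

```python
def parallel_units(area_budget, word_bits):
    """Number of arithmetic units that fit in a given logic budget.

    Assumed cost model (illustrative only): one multiplier occupies
    roughly word_bits * word_bits logic cells.
    """
    return area_budget // (word_bits * word_bits)


def throughput_ops_per_second(area_budget, word_bits, clock_hz):
    """Aggregate throughput, assuming one operation per unit per cycle."""
    return parallel_units(area_budget, word_bits) * clock_hz
```

Under this model a budget of 65536 cells holds 64 units at 32 bits but 256 units at 16 bits, so halving the word width quadruples the throughput at the same clock rate.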

The adoption of FPGAs in high performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
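The volume argument above amounts to a break-even calculation between a one-time development cost and a per-unit premium. The following Python sketch shows that arithmetic; the parameter names are hypothetical and any figures passed in are illustrative, not data from this report.

```python
def total_cost(nre, unit_cost, volume):
    """One-time development (NRE) cost plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit_cost, fpga_nre, fpga_unit_cost):
    """Volume at which ASIC and FPGA total costs are equal.

    Below this volume the FPGA is cheaper; above it the ASIC is cheaper.
    Assumes the FPGA costs more per unit and less to develop.
    """
    if fpga_unit_cost <= asic_unit_cost:
        raise ValueError("FPGA unit cost must exceed ASIC unit cost")
    return (asic_nre - fpga_nre) / (fpga_unit_cost - asic_unit_cost)


def cheaper_option(volume, asic_nre, asic_unit_cost, fpga_nre, fpga_unit_cost):
    """Return 'FPGA' or 'ASIC', whichever has the lower total cost."""
    fpga = total_cost(fpga_nre, fpga_unit_cost, volume)
    asic = total_cost(asic_nre, asic_unit_cost, volume)
    return "FPGA" if fpga <= asic else "ASIC"
```

For example, with an assumed ASIC development cost of 1,000,000 and unit costs of 5 (ASIC) and 25 (FPGA), the break-even point is 50,000 units. Rising ASIC development costs push that point upward, which is the broadening effect the paragraph describes.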

                                                                                        Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
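For reference, Rent's rule relates the number of external terminals T of a block of logic to the number of logic elements g it contains, as T = t * g^p, where t is the average number of terminals per element and p is the Rent exponent. A minimal Python sketch follows; the function name is hypothetical, and how a manufacturer turns this estimate into a channel width is not covered here.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g**p.

    blocks:              number of logic elements g in the partition
    terminals_per_block: average terminals per element t
    rent_exponent:       Rent exponent p (larger means more wiring demand)
    """
    if blocks < 1:
        raise ValueError("blocks must be at least 1")
    return terminals_per_block * blocks ** rent_exponent
```

A design with a higher Rent exponent needs more connections for the same amount of logic, which is consistent with the crossbar versus systolic array comparison above.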

The FPGA considered here is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high performance parts, claiming increased performance.

                                                                                        Typical logic block


There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
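The behavior of such a logic block can be modeled in a few lines of Python. This is a behavioral sketch only, with hypothetical class and method names: the truth table plays the role of the LUT configuration bits, and the registered flag selects between the flip-flop output and the raw LUT output.

```python
class LogicBlock:
    """Behavioral model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = list(truth_table)  # LUT configuration bits
        self.registered = registered          # output select
        self.q = 0                            # flip-flop state

    def lut(self, a, b, c, d):
        """Look up the output for inputs a (least significant bit) to d."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return self.truth_table[index]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Return the registered or the unregistered output."""
        return self.q if self.registered else self.lut(a, b, c, d)


# Example configuration: a 4-input AND gate (only entry 15 is 1).
AND4 = [0] * 15 + [1]
```

With registered=False the block acts as pure combinational logic. With registered=True the output changes only when clock() is called, as it would on a clock edge.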

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.
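The planar topology can be expressed as a small function: a wire entering on one side may be switched to the same track number on each of the other three sides. The Python sketch below uses hypothetical side names and 1-based track numbers to match the description above.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side, track, channel_width):
    """List the (side, track) wires reachable from a wire entering a
    planar (domain-based) switch box.

    The wire keeps its track number and can reach the three other sides.
    """
    if side not in SIDES:
        raise ValueError("unknown side: " + side)
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and channel_width")
    return [(other, track) for other in SIDES if other != side]
```

Because a signal can never change track number inside a planar switch box, the routing fabric behaves like W independent wiring domains, one per track.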

                                                                                        Switch box topology

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
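The idea of repeating the same test bench at the RTL and gate level can be illustrated in Python with a one-bit full adder. This is only an analogy to the HDL flow, not HDL itself: full_adder_rtl stands in for the behavioral description, full_adder_gates for the synthesized netlist, and all names are hypothetical.

```python
from itertools import product


def full_adder_rtl(a, b, cin):
    """Behavioral (RTL-style) description: add three bits."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def full_adder_gates(a, b, cin):
    """Gate-level description built only from XOR, AND and OR."""
    a_xor_b = a ^ b
    sum_bit = a_xor_b ^ cin
    carry_out = (a & b) | (a_xor_b & cin)
    return sum_bit, carry_out


def run_test_bench(reference, candidate, n_inputs):
    """Apply every input vector to both models and collect mismatches."""
    mismatches = []
    for vector in product((0, 1), repeat=n_inputs):
        if reference(*vector) != candidate(*vector):
            mismatches.append(vector)
    return mismatches
```

An empty list from run_test_bench(full_adder_rtl, full_adder_gates, 3) corresponds to the gate-level simulation confirming that synthesis preserved the behavior. Exhaustive testing is only practical for very small blocks; real test benches use selected or randomized vectors.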

                                                                                        Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is a method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good way to do this.


                                                                                        Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This helps expose many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.


                                                                                        13 SIMULATION RESULTS


                                                                                        GATE LEVEL

Fig VGA CONTROLLER

                                                                                        67

                                                                                        GATE LEVEL

                                                                                        Fig SYNC UNIT

                                                                                        68

                                                                                        TECHNOLOGY SCHEMATIC

                                                                                        69

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"


  -- Pixel generator instance
  pixel: pixelgen port map(
    clk        => clk,
    rstn       => rstn,
    hsync      => hsync_s,
    vsync      => vsync_s,
    h_en       => h_en_s,
    v_en       => v_en_s,
    v_en_reg   => v_en_reg_s,
    data_in    => data_s,
    line_count => line_count_s,
    c_hs       => c_hs_s,
    red        => red,
    green      => green,
    blue       => blue,
    addr       => addr_s,
    rd         => rd_s
  );

  -- Picture memory instance
  mem12: memory port map(
    addr  => addr_s,
    clk   => clk,
    dout  => data_s,
    en    => rd_s,
    sinit => rstn
  );

end Behavioral;
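The memory component instantiated above is not listed in this section. To simulate the design without it, a behavioral stand-in along the following lines could be used. This is only a sketch: the port widths are inferred from the pixelgen ports, the synchronous enable-gated read is an assumption, sinit is left unused because its polarity is not stated, and the stripe pattern is made up for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Behavioral stand-in for the memory component, for simulation only.
-- Assumptions: reads are synchronous and gated by en; sinit is unused
-- because the source does not state its polarity; the pattern is made up.
entity memory is
  port(
    addr  : in  std_logic_vector(17 downto 0);
    clk   : in  std_logic;
    dout  : out std_logic_vector(0 downto 0);
    en    : in  std_logic;
    sinit : in  std_logic
  );
end memory;

architecture sim of memory is
begin
  process(clk)
  begin
    if rising_edge(clk) then
      if en = '1' then
        -- Made-up test pattern: the output toggles every 8 addresses.
        dout(0) <= addr(3);
      end if;
    end if;
  end process;
end sim;
```

With this stand-in compiled into the work library, the mem12 instance binds to it by default, and the red, green and blue outputs should show alternating stripes during the active read window.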


SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  -- horizontal sync pulse (row)
  vsync      : out std_logic;  -- vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s       : std_logic;
  signal line_count_s, c_hs_s : integer;
begin
  c_hs       <= c_hs_s;
  line_count <= line_count_s;
  h_en       <= h_en_s;
  v_en       <= v_en_s;

  -- hsync counter: counts pixel clocks 0 to 793 within each line
  process(clk, rstn)
  begin
    if (rstn = '0') then
      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;

  -- vsync counter
  -- vertical line counter: advances on each rising edge of h_en_s, 0 to 515
  process(h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  -- hsync pulse
  process(clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  process(clk, rstn)  -- Horizontal Enable (row information will be sent in this period)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  -- vertical enable
  process(clk, rstn)  -- Vertical Enable (data is valid in this region)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  process(clk, rstn)  -- Vertical Sync, Front Porch, Back Porch
  begin
    if (rstn = '0') then
      vsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (line_count_s < 2) then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  -- v_en delayed by one clock, used by pixelgen to detect the start of a frame
  process(clk)
  begin
    if (clk = '1' and clk'event) then
      v_en_reg <= v_en_s;
    end if;
  end process;

end arch_sync_unit;
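One way to exercise sync_unit in simulation is a small testbench like the sketch below. The testbench name tb_sync_unit, the 40 ns clock period (roughly a 25 MHz pixel clock) and the reset length are assumptions, since the source does not state them. The run length follows from the counters above: 794 clocks per line and 516 lines per frame.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical testbench; the name tb_sync_unit is an assumption.
entity tb_sync_unit is
end tb_sync_unit;

architecture sim of tb_sync_unit is
  -- Component declaration matching the sync_unit entity listed above.
  component sync_unit
    port(
      clk        : in  std_logic;
      rstn       : in  std_logic;
      hsync      : out std_logic;
      vsync      : out std_logic;
      v_en_reg   : out std_logic;
      h_en       : out std_logic;
      v_en       : out std_logic;
      line_count : out integer;
      c_hs       : out integer
    );
  end component;

  -- Assumed 40 ns period; the source does not give the clock rate.
  constant CLK_PERIOD : time := 40 ns;

  signal clk  : std_logic := '0';
  signal rstn : std_logic := '0';
  signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
  signal line_count, c_hs : integer;
begin
  dut: sync_unit port map(
    clk => clk, rstn => rstn, hsync => hsync, vsync => vsync,
    v_en_reg => v_en_reg, h_en => h_en, v_en => v_en,
    line_count => line_count, c_hs => c_hs
  );

  -- Free-running clock.
  clk <= not clk after CLK_PERIOD / 2;

  -- Release the active-low reset, then run for a little over one frame.
  stimulus: process
  begin
    rstn <= '0';
    wait for 4 * CLK_PERIOD;
    rstn <= '1';
    wait for 794 * 516 * CLK_PERIOD + 1000 * CLK_PERIOD;
    -- A failure-severity assertion is used here only to stop the simulator.
    assert false report "End of simulation" severity failure;
    wait;
  end process;
end sim;
```

In the resulting waveform, hsync should pulse low once per line and vsync once per frame, with h_en and v_en marking the region in which pixel data is sent.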


PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Rising edge of v_en (start of a frame): restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Inside the picture: read memory and drive all three colors
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address at the end of the picture memory
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;

end behv;
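The sync unit and pixel generator above compare their counters against bare numeric literals. As a readability sketch, those literals could be collected in a package of named constants. The package name vga_timing_pkg and every constant name below are assumptions made for illustration; only the numeric values come from the listings.

```vhdl
-- Hypothetical package; vga_timing_pkg and all constant names are assumptions.
-- The values are the literals used in sync_unit and pixelgen above.
package vga_timing_pkg is
  -- Horizontal timing, in pixel clocks (c_hs counts 0 to H_LAST).
  constant H_LAST         : integer := 793;  -- last count of a line
  constant H_SYNC_LAST    : integer := 95;   -- hsync driven low while c_hs <= 95
  constant H_ENABLE_FIRST : integer := 143;  -- h_en driven high from 143 ...
  constant H_ENABLE_LAST  : integer := 778;  -- ... through 778
  constant H_READ_LAST    : integer := 655;  -- pixelgen reads while c_hs <= 655

  -- Vertical timing, in lines (line_count counts 0 to V_LAST).
  constant V_LAST         : integer := 515;  -- last line of a frame
  constant V_SYNC_LINES   : integer := 2;    -- vsync driven low while line_count < 2
  constant V_ENABLE_FIRST : integer := 35;   -- v_en driven high from line 35 ...
  constant V_ENABLE_LAST  : integer := 515;  -- ... through line 515
  constant V_READ_LAST    : integer := 514;  -- pixelgen reads while line_count <= 514

  -- Derived totals.
  constant H_TOTAL : integer := H_LAST + 1;  -- 794 clocks per line
  constant V_TOTAL : integer := V_LAST + 1;  -- 516 lines per frame
end package vga_timing_pkg;
```

A design unit would pull these in with `use work.vga_timing_pkg.all;` and then write, for example, `if (c_hs_s = H_LAST) then` in place of the literal 793. Behavior is unchanged; only the intent of each comparison becomes visible.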


12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
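One common way a logic block realizes a configurable combinational function is a lookup table whose contents are the function's truth table. The sketch below models a 3-input lookup table in VHDL. The entity name lut3 and the generic INIT are assumptions for illustration, not a vendor primitive, and the sketch uses the standard NUMERIC_STD package rather than the STD_LOGIC_ARITH package used in the listings.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input lookup table; the names lut3 and INIT are assumptions.
entity lut3 is
  generic(
    -- Truth table: bit i is the output for input value i.
    -- "10000000" gives a 3-input AND; "10010110" gives a 3-input XOR.
    INIT : std_logic_vector(7 downto 0) := "10000000"
  );
  port(
    a : in  std_logic_vector(2 downto 0);
    y : out std_logic
  );
end lut3;

architecture rtl of lut3 is
begin
  -- The inputs select one bit of the stored truth table.
  y <= INIT(to_integer(unsigned(a)));
end rtl;
```

Changing only the INIT value turns the same structure from an AND gate into an XOR gate, which is the sense in which a logic block is "configured" rather than rewired.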

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

                                                                                          Some of the industryrsquos foundational concepts and technologies for

                                                                                          programmable logic arrays gates and logic blocks are founded in patents

                                                                                          awarded to David W Page and LuVerne R Peterson in 1985

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when competitors sprouted up and eroded a significant part of its market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

                                                                                          121 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft processor cores that are implemented within the FPGA logic. (See Soft processors below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

                                                                                          Gates

1987: 9,000 gates, Xilinx
1992: 600,000 gates, Naval Surface Warfare Department
Early 2000s: millions of gates

                                                                                          Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

                                                                                          122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers have largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits them with the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result of this is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
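The sum-of-products structure mentioned above can be illustrated with a short Python sketch. It is only a software model under stated assumptions: the function names and the example expression f = (a AND NOT b) OR (b AND c) are hypothetical and are not taken from any particular CPLD.

```python
from itertools import product


def sum_of_products(terms, inputs):
    """Evaluate a sum-of-products expression.

    terms:  list of product terms; each term maps an input name to the value
            that input must have (True = plain literal, False = inverted).
    inputs: dict mapping input name -> bool.
    The result is the OR of all product terms, each term being an AND of
    its literals.
    """
    return any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in terms
    )


# Hypothetical example: f = (a AND NOT b) OR (b AND c)
TERMS = [{"a": True, "b": False}, {"b": True, "c": True}]


def truth_table(terms, names):
    """List (input values, output) for every combination of the inputs."""
    rows = []
    for values in product([False, True], repeat=len(names)):
        rows.append((values, sum_of_products(terms, dict(zip(names, values)))))
    return rows


if __name__ == "__main__":
    for values, out in truth_table(TERMS, ["a", "b", "c"]):
        print(values, "->", out)
```

Because every output is one OR over a fixed set of AND terms, the path from input to output has the same depth for every function, which is one way to read the "more predictable timing delays" remark.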

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

                                                                                          123 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4 to 8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
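The low-volume argument above is a break-even calculation, which the following Python sketch works through. It is a simplified model, not a costing method from the source: it assumes the FPGA route has no up-front engineering charge, and every figure in the usage lines is hypothetical.

```python
import math


def total_cost_fpga(units, fpga_unit_cost):
    """Total cost of shipping `units` devices built on an FPGA.

    Simplifying assumption: no up-front (NRE) charge for the FPGA route.
    """
    return units * fpga_unit_cost


def total_cost_asic(units, asic_nre, asic_unit_cost):
    """Total cost of shipping `units` ASICs: one-time NRE plus per-unit cost."""
    return asic_nre + units * asic_unit_cost


def break_even_units(asic_nre, asic_unit_cost, fpga_unit_cost):
    """Smallest volume at which the ASIC is no more expensive than the FPGA.

    Returns None when the FPGA is not more expensive per unit, in which case
    the ASIC never catches up under this model.
    """
    if fpga_unit_cost <= asic_unit_cost:
        return None
    return math.ceil(asic_nre / (fpga_unit_cost - asic_unit_cost))


if __name__ == "__main__":
    # Hypothetical figures chosen only to make the arithmetic easy to follow.
    nre, asic_unit, fpga_unit = 1_000_000, 5, 45
    n = break_even_units(nre, asic_unit, fpga_unit)
    print("break-even volume:", n)
    print("FPGA cost at 1,000 units:", total_cost_fpga(1_000, fpga_unit))
    print("ASIC cost at 1,000 units:", total_cost_asic(1_000, nre, asic_unit))
```

Below the break-even volume the per-unit premium of the programmable part costs less than the ASIC's one-time development charge; above it the ASIC wins under this model.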

                                                                                          Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
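Rent's rule is usually written as T = t * g^p, where T is the number of external terminals of a block of logic, g is the number of gates in the block, t is the average number of terminals per gate, and p is the Rent exponent. The sketch below only evaluates that formula. The parameter values are hypothetical, and it does not reproduce how any manufacturer actually chooses a channel width.

```python
def rent_terminals(gates, t, p):
    """Estimate external terminals with Rent's rule: T = t * g**p.

    gates: number of gates (or logic blocks) in the region, g
    t:     average number of terminals per gate
    p:     Rent exponent (0 < p <= 1); larger p means more wiring demand
    """
    return t * gates ** p


if __name__ == "__main__":
    # Hypothetical parameters: same gate count, different Rent exponents.
    for p in (0.5, 0.75):
        print(f"p = {p}: about {rent_terminals(256, 4.0, p):.0f} terminals")
```

Two designs with the same gate count but different exponents give very different terminal estimates, which matches the observation that routing demand varies even when the amount of logic does not.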

The example FPGA considered here is an array (island-style) FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

                                                                                          121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                                          Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
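The logic block described above (a 4-input LUT, a flip-flop, and a single output that is either registered or unregistered) can be modelled in a few lines of Python. This is a behavioural sketch only: the class name, the bit ordering of the LUT index, and the XOR example are assumptions made for illustration, not details of any vendor's device.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, lut_bits, registered):
        if len(lut_bits) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.lut_bits = list(lut_bits)  # truth table, one bit per input combo
        self.registered = registered    # True: output comes from the flip-flop
        self.ff = 0                     # flip-flop state, assumed to reset to 0

    def lut(self, a, b, c, d):
        """Look up the configured output; `a` is the least significant bit."""
        return self.lut_bits[a | (b << 1) | (c << 2) | (d << 3)]

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


def lut_bits_for(func):
    """Build the 16 configuration bits for any 4-input boolean function."""
    return [
        int(func(i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1))
        for i in range(16)
    ]


# Hypothetical configuration: 4-input XOR (parity).
XOR4 = lut_bits_for(lambda a, b, c, d: a ^ b ^ c ^ d)

if __name__ == "__main__":
    comb = LogicBlock(XOR4, registered=False)
    print("unregistered:", comb.output(1, 0, 0, 0))
    reg = LogicBlock(XOR4, registered=True)
    print("registered before clock:", reg.output(1, 0, 0, 0))
    reg.clock(1, 0, 0, 0)
    print("registered after clock:", reg.output(0, 0, 0, 0))
```

The same 16 configuration bits can encode any function of four inputs, which is why the LUT is the basic programmable element; the `registered` flag plays the role of the output selection described in the text.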

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                          Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                          Switch box topology
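The planar (domain-based) topology described above can be written down as a small connectivity table. The Python sketch below is an illustration under assumptions: the side names and the function name are hypothetical, and tracks are numbered from 0 rather than from 1 as in the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width):
    """Build the connection table of a planar (domain-based) switch box.

    Returns a dict mapping each incoming wire (side, track) to the list of
    wires it can be switched onto. A wire on track k reaches only track k
    on each of the three other sides.
    """
    connections = {}
    for side in SIDES:
        for track in range(channel_width):
            connections[(side, track)] = [
                (other, track) for other in SIDES if other != side
            ]
    return connections


if __name__ == "__main__":
    box = planar_switch_box(channel_width=4)
    print(box[("north", 1)])
```

Each wire gets exactly three programmable switches, and a signal that starts on one track stays on that track number as it passes through successive switch boxes.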

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

                                                                                          124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license), and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
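As an illustration of the first (RTL) simulation stage, the following is a minimal test bench sketch for the sync_unit entity listed in this report's code appendix. The port names and the expected counts (a 794-clock line, 96 clocks of low hsync) are taken from that listing; the test bench name tb_sync_unit, the done flag and the 40 ns clock period (a 25 MHz pixel clock) are assumptions made only for this example.

```vhdl
-- Sketch of an RTL test bench for sync_unit (illustrative, not the authors' bench).
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity tb_sync_unit is
end tb_sync_unit;

architecture sim of tb_sync_unit is
  -- Assumption: 25 MHz pixel clock (40 ns period).
  constant CLK_PERIOD : time := 40 ns;

  signal clk  : std_logic := '0';
  signal rstn : std_logic := '0';
  signal hsync, vsync, v_en_reg, h_en, v_en : std_logic;
  signal line_count, c_hs : integer;
  signal done : boolean := false;  -- stops the clock when the checks finish
begin
  -- Device under test: the sync_unit from the appendix.
  dut : entity work.sync_unit
    port map (
      clk => clk, rstn => rstn,
      hsync => hsync, vsync => vsync,
      v_en_reg => v_en_reg, h_en => h_en, v_en => v_en,
      line_count => line_count, c_hs => c_hs);

  -- Free-running clock until the stimulus process sets done.
  clock_gen : process
  begin
    while not done loop
      clk <= '0';
      wait for CLK_PERIOD / 2;
      clk <= '1';
      wait for CLK_PERIOD / 2;
    end loop;
    wait;
  end process;

  stimulus : process
    variable t_fall : time;
  begin
    -- Hold the active-low reset for a few clocks, then release it.
    rstn <= '0';
    wait for 5 * CLK_PERIOD;
    rstn <= '1';

    -- hsync is driven low while the horizontal counter is 0 to 95.
    wait until falling_edge(hsync);
    t_fall := now;
    wait until rising_edge(hsync);
    assert now - t_fall = 96 * CLK_PERIOD
      report "hsync low time is not 96 clocks" severity error;

    -- The horizontal counter wraps after 793, so one line is 794 clocks.
    wait until falling_edge(hsync);
    assert now - t_fall = 794 * CLK_PERIOD
      report "hsync period is not 794 clocks" severity error;

    report "sync_unit horizontal timing checked";
    done <= true;
    wait;
  end process;
end sim;
```

The same bench could in principle be reused at the gate-level and back-annotated stages by binding dut to the post-synthesis or post-layout netlist instead of the RTL architecture.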

                                                                                          Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

                                                                                          Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is in contrast to designers relying only on software simulations to verify that their hardware design is sound; simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                          13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

                                                                                          14 APPLICATIONS

                                                                                          Motion Tracking Systems

                                                                                          Image processing systems

                                                                                          Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated picture (pixelated) from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

                                                                                            SYNCHRONIZATION BLOCK CODE

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity sync_unit is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : out std_logic;  --horizontal sync pulse (row)
  vsync      : out std_logic;  --vertical sync pulse (frame) 111110100000000000
  v_en_reg   : out std_logic;
  h_en       : out std_logic;
  v_en       : out std_logic;
  line_count : out integer;
  c_hs       : out integer );
end sync_unit;

architecture arch_sync_unit of sync_unit is
  signal h_en_s, v_en_s : std_logic;
  signal line_count_s, c_hs_s : integer;
begin
  c_hs <= c_hs_s;
  line_count <= line_count_s;
  h_en <= h_en_s;
  v_en <= v_en_s;

  -- hsync counter
  process(clk,rstn)
  begin
    if(rstn = '0')then
      c_hs_s <= 0;
    else
      if(clk = '1' and clk'event)then
        if(c_hs_s=793)then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s +1;
        end if;
      end if;
    end if;
  end process;

  --vsync counter
  --vertical line counter
  process(h_en_s,rstn)
  begin
    if(rstn = '0')then
      line_count_s <= 0;
    elsif(h_en_s = '1' and h_en_s'event)then
      line_count_s <= line_count_s + 1;
      if (line_count_s=515)then
        line_count_s <= 0;
      end if;
    end if;
    --end if;
  end process;

  --hsync pulse
  process(clk,rstn)
  begin
    if(rstn = '0')then
      hsync <= '1';
    elsif(clk = '1' and clk'event)then
      if(c_hs_s <=95)then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  process(clk,rstn) --Horizontal Enable (Row Information Will be Sent in this period)
  begin
    if(rstn = '0')then
      h_en_s <= '0';
    elsif(clk = '1' and clk'event)then
      if((c_hs_s >=143) and (c_hs_s<=778))then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  --vsync enable
  process(clk,rstn) -- Vertical Enable (Data is Valid in this Region)
  begin
    if(rstn = '0')then
      v_en_s <= '0';
    elsif(clk = '1' and clk'event)then
      if((line_count_s>=35 and line_count_s<=515))then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
    --end if;
  end process;

  process(clk,rstn) --Vertical Sync, Front Porch, Back Porch
  begin
    if(rstn = '0')then
      vsync <= '1';
    elsif(clk = '1' and clk'event)then
      if(line_count_s <2)then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
    end if;
    --end if;
  end process;

  process(clk)
  begin
    if(clk = '1' and clk'event)then
      v_en_reg<=v_en_s;
    end if;
  end process;
end arch_sync_unit;
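The counter limits in the sync_unit listing fix the scan rates once a pixel clock is chosen. As a rough worked example, the horizontal counter runs from 0 to 793 and the line counter from 0 to 515; the 25 MHz pixel clock used here is an assumption, since the clock frequency is not stated in the listing, and the symbols N_h, N_v and f_clk are introduced only for this calculation.

```latex
\begin{align*}
N_h &= 793 + 1 = 794 \ \text{clocks per line}, &
N_v &= 515 + 1 = 516 \ \text{lines per frame},\\
f_{\text{line}} &= \frac{f_{\text{clk}}}{N_h}
  = \frac{25\ \text{MHz}}{794} \approx 31.49\ \text{kHz}, &
f_{\text{frame}} &= \frac{f_{\text{line}}}{N_v}
  \approx \frac{31.49\ \text{kHz}}{516} \approx 61.0\ \text{Hz}.
\end{align*}
```

Under that assumption the design would refresh at roughly 61 frames per second, close to the nominal 60 Hz of 640x480 VGA.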

                                                                                            PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;
  process(clk,rstn)
  begin
    if(rstn = '0')then
      rd <= '0';
      red <= "0";
      green <= "0";
      blue <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event)then
      if((v_en='1') and (v_en_reg='0'))then
        addr_s<=(others =>'0');
      end if;
      if(v_en='1' and h_en='1')then
        if(line_count<=514)then
          if(c_hs <= 655)then
            rd <= '1';
            red <= data_in;
            green <= data_in;
            blue <= data_in;
            addr_s <=addr_s +"000000000000000001";
          else
            rd <= '0';
            red <= "0";
            green <= "1";
            blue <= "0";

                                                                                            end if

                                                                                            else

                                                                                            rd lt= 0

                                                                                            red lt= 1

                                                                                            green lt= 1

                                                                                            blue lt= 1

                                                                                            51

                                                                                            end if

                                                                                            if(addr_s=111011111111111111)then

                                                                                            addr_slt= (others =gt0)

                                                                                            end if

                                                                                            else

                                                                                            red lt= 0

                                                                                            green lt= 0

                                                                                            blue lt= 0

                                                                                            end if

                                                                                            end if

                                                                                            -- end if

                                                                                            --end if

                                                                                            end process

                                                                                            end behv

                                                                                            12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions or merely
simple logic gates such as AND and XOR. In most FPGAs the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of
being programmed in batches in a factory or in the field (field-programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field-programmable gate array, the XC2064, in 1985.
The XC2064 had programmable gates and programmable interconnects between
gates, the beginnings of a new technology and market. The XC2064 boasted a
mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables
(LUTs). More than 20 years later, Freeman was entered into the National
Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s,
when competitors sprouted up and eroded a significant part of its market
share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternative to using hard-macro processors is to use soft processor
cores that are implemented within the FPGA logic (see "Soft processors"
below).

As previously mentioned, many modern FPGAs can be reprogrammed at run time,
and this is leading to the idea of reconfigurable computing or reconfigurable
systems: CPUs that reconfigure themselves to suit the task at hand. The
Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable
soft processor implemented on FPGAs. However, it does not support dynamic
reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

    1987          9,000 gates (Xilinx)
    1992          600,000 gates (Naval Surface Warfare Department)
    Early 2000s   Millions

Market size

    1985             First commercial FPGA technology invented by Xilinx
    1987             $14 million
    ~1993            >$385 million
    2005             $1.9 billion
    2010 estimates   $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

- IC costs are rising aggressively
- ASIC complexity has increased development time and costs
- R&D resources and headcount are decreasing
- Revenue losses for slow time-to-market are increasing
- Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, and the company attributes the growing number of FPGA design starts to
them (see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design for.
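
As a rough illustration of the CPLD structure described above, the following
VHDL sketch models a single macrocell: a small sum-of-products expression
feeding one clocked register. The entity name sop_macrocell, its ports, and the
particular Boolean function are assumptions made for this example only and are
not taken from any vendor's device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical CPLD-style macrocell (illustrative names, not a real device).
entity sop_macrocell is
  port (
    clk        : in  std_logic;
    a, b, c, d : in  std_logic;
    q          : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum : std_logic;
begin
  -- Sum of two product terms: f = a.b + (not c).d (an arbitrary example).
  sum <= (a and b) or ((not c) and d);

  -- The sum-of-products result feeds a single clocked register.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum;
    end if;
  end process;
end architecture rtl;
```

Because the path from the inputs to the register passes through one fixed
AND-OR plane, its delay is largely independent of the function implemented,
which is the predictable timing mentioned above.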

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs support partial reconfiguration, which lets one portion of the
device be reprogrammed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs began
to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
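
The trade-off between number format and parallelism can be sketched in VHDL
with two generics. In the example below, which is a minimal illustration and
not part of the VGA design, LANES sets how many multipliers run side by side
and WIDTH sets the operand precision; a narrower WIDTH leaves room for more
LANES in the same logic. The entity name parallel_mult and its flattened
vector ports are assumptions chosen for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical bank of parallel unsigned multipliers (illustrative only).
entity parallel_mult is
  generic (
    LANES : positive := 4;  -- number of multipliers working in parallel
    WIDTH : positive := 8   -- operand width in bits (precision)
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    b   : in  std_logic_vector(LANES*WIDTH - 1 downto 0);
    p   : out std_logic_vector(LANES*2*WIDTH - 1 downto 0)
  );
end entity parallel_mult;

architecture rtl of parallel_mult is
begin
  -- One registered multiplier per lane; all lanes update on the same clock edge.
  gen_lanes : for i in 0 to LANES - 1 generate
    process (clk)
    begin
      if rising_edge(clk) then
        p((i + 1)*2*WIDTH - 1 downto i*2*WIDTH) <= std_logic_vector(
          unsigned(a((i + 1)*WIDTH - 1 downto i*WIDTH)) *
          unsigned(b((i + 1)*WIDTH - 1 downto i*WIDTH)));
      end if;
    end process;
  end generate gen_lanes;
end architecture rtl;
```

With the default generics, four 8-bit products are produced on every clock
cycle; the number of lanes that actually fits depends on the target device.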

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have
broadened the range of viable applications.

                                                                                            Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
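
For reference, Rent's rule is commonly written as the power law below. The
symbols are not defined in the text above and are introduced here only for
illustration: T is the number of external connections (terminals) of a block of
logic, g is the number of gates or logic blocks it contains, t is the average
number of terminals per gate, and p is the Rent exponent, with 0 < p < 1.

```latex
T = t \, g^{p}
```

Under this model, a design with a larger exponent p needs more connections for
the same amount of logic, and therefore more routing tracks.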

The example architecture considered here is an array, or island-style, FPGA.
It consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All the
routing channels have the same width (number of wires).

                                                                                            121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                            Typical logic block


There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
are managed separately from other signals.
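The logic block described above can be sketched in VHDL. This is a minimal behavioural model, not a vendor primitive: the entity name logic_block, the generics LUT_INIT and USE_FF, and the port names are all assumptions made for illustration. In a real device the LUT contents and the output selection are set by configuration bits rather than by generics.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a classic logic block: one 4-input LUT and one flip-flop.
entity logic_block is
  generic (
    -- 16-entry truth table; x"8000" sets only entry 15, i.e. a 4-input AND
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";
    -- true selects the registered output, false the unregistered LUT output
    USE_FF   : boolean := false
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- the four LUT inputs
    y   : out std_logic                      -- the single block output
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The four inputs form an address into the truth table.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- The flip-flop captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or unregistered LUT output.
  y <= ff_out when USE_FF else lut_out;
end architecture rtl;
```

Changing LUT_INIT changes the logic function without changing the structure, which is the essential idea of a lookup-table-based block.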

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                            Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                            Switch box topology
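The planar topology described above can be counted directly. As a sketch, assume a four-sided switch box with W tracks on each side, number the sides 1 to 4, and count each programmable switch once; the set name C is introduced here only for illustration.

```latex
\[
C = \bigl\{ \{(s,t),(s',t)\} : s \neq s',\; s,s' \in \{1,2,3,4\},\; t \in \{1,\dots,W\} \bigr\},
\qquad
|C| = \binom{4}{2}\, W = 6W
\]
```

Each wire (s, t) appears in exactly three of these pairs, which matches the three programmable switches per wire, and no pair joins two different track numbers. Under these assumptions, a channel width of W = 4 gives 24 switches per switch box.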

Modern FPGA families expand upon the above capabilities to include higher-level
functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.
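The remark about specifying large structures numerically can be illustrated with a VHDL generic. The following is a minimal sketch; the entity generic_counter and its WIDTH generic are hypothetical names that are not part of the design described in this report.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical example: the size of the counter is given by a number (WIDTH)
-- instead of drawing one flip-flop per bit in a schematic.
entity generic_counter is
  generic (
    WIDTH : positive := 10  -- assumed default; any positive width works
  );
  port (
    clk   : in  std_logic;
    rstn  : in  std_logic;  -- active-low asynchronous reset
    count : out std_logic_vector(WIDTH - 1 downto 0)
  );
end entity generic_counter;

architecture rtl of generic_counter is
  signal count_s : unsigned(WIDTH - 1 downto 0);
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      count_s <= (others => '0');
    elsif rising_edge(clk) then
      count_s <= count_s + 1;  -- wraps to 0 after 2**WIDTH - 1
    end if;
  end process;

  count <= std_logic_vector(count_s);
end architecture rtl;
```

Changing WIDTH from 10 to 32 resizes every register and adder bit at once, whereas a schematic would need each added bit drawn by hand.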

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free and released under the GPL, BSD, or similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
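The first simulation stage, an RTL test bench, can be sketched as follows. The design under test (and2) and the test bench (and2_tb) are hypothetical and deliberately trivial; they are not the VGA controller from this report, only an illustration of how stimulus is applied and results are checked.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical design under test: a 2-input AND gate.
entity and2 is
  port (
    a, b : in  std_logic;
    y    : out std_logic
  );
end entity and2;

architecture rtl of and2 is
begin
  y <= a and b;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;

-- A test bench has no ports: it generates the stimulus and checks the results.
entity and2_tb is
end entity and2_tb;

architecture sim of and2_tb is
  signal a, b : std_logic := '0';
  signal y    : std_logic;
begin
  dut : entity work.and2
    port map (a => a, b => b, y => y);

  stimulus : process
  begin
    a <= '0'; b <= '0'; wait for 10 ns;
    assert y = '0' report "0 and 0 should give 0" severity error;

    a <= '0'; b <= '1'; wait for 10 ns;
    assert y = '0' report "0 and 1 should give 0" severity error;

    a <= '1'; b <= '0'; wait for 10 ns;
    assert y = '0' report "1 and 0 should give 0" severity error;

    a <= '1'; b <= '1'; wait for 10 ns;
    assert y = '1' report "1 and 1 should give 1" severity error;

    report "test bench finished" severity note;
    wait;  -- suspend the process forever so the simulation can end
  end process stimulus;
end architecture sim;
```

In principle, the same test bench can be reused at the later stages by binding dut to the gate-level or back-annotated netlist instead of the RTL architecture, although the exact procedure depends on the simulator and vendor tools.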

                                                                                            Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                            Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM-based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of hardware designs, together with early software and
firmware co-design, has become mainstream. Prototyping SoC and ASIC
designs on an FPGA has become a good method for doing this.


                                                                                            Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. This is compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows for much more
time in the area of software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs that appear due
to today's array of operating systems, applications, and hardware to be
exposed early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement to the chip features than would
otherwise have been possible.


                                                                                            13 SIMULATION RESULTS

GATE LEVEL

Fig VGA CONTROLLER

GATE LEVEL

Fig SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                            14 APPLICATIONS

                                                                                            Motion Tracking Systems

                                                                                            Image processing systems

                                                                                            Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.


                                                                                            REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. "A VGA display controller" by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"


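      c_hs_s <= 0;
    else
      if (clk = '1' and clk'event) then
        -- horizontal counter runs 0..793 (794 clocks per line), then wraps;
        -- c_hs_s is assumed to be an integer signal (declaration not shown here)
        if (c_hs_s = 793) then
          c_hs_s <= 0;
        else
          c_hs_s <= c_hs_s + 1;
        end if;
      end if;
    end if;
  end process;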

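  -- vsync counter
  -- vertical line counter: advances once per line, on the rising edge of h_en_s
  process (h_en_s, rstn)
  begin
    if (rstn = '0') then
      line_count_s <= 0;
    elsif (h_en_s = '1' and h_en_s'event) then
      line_count_s <= line_count_s + 1;
      -- the later assignment overrides the increment, so the count wraps after 515
      if (line_count_s = 515) then
        line_count_s <= 0;
      end if;
    end if;
  end process;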

  -- hsync pulse: driven low for horizontal counts 0..95 (96 clocks), high otherwise
  process (clk, rstn)
  begin
    if (rstn = '0') then
      hsync <= '1';
    elsif (clk = '1' and clk'event) then
      if (c_hs_s <= 95) then
        hsync <= '0';
      else
        hsync <= '1';
      end if;
    end if;
  end process;

  -- Horizontal enable (row information will be sent in this period):
  -- driven high for horizontal counts 143..778 (636 clocks)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      h_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
        h_en_s <= '1';
      else
        h_en_s <= '0';
      end if;
    end if;
  end process;

  -- Vertical enable (data is valid in this region):
  -- driven high for line counts 35..515 (481 lines)
  process (clk, rstn)
  begin
    if (rstn = '0') then
      v_en_s <= '0';
    elsif (clk = '1' and clk'event) then
      if ((line_count_s >= 35 and line_count_s <= 515)) then
        v_en_s <= '1';
      else
        v_en_s <= '0';
      end if;
    end if;
  end process;

process(clk, rstn) -- Vertical Sync, Front Porch, Back Porch
begin
    if (rstn = '0') then
        vsync <= '1';
    elsif (clk = '1' and clk'event) then
        -- The sync pulse is active low during line counts 0 and 1.
        if (line_count_s < 2) then
            vsync <= '0';
        else
            vsync <= '1';
        end if;
    end if;
    --end if;
end process;

-- One-clock delayed copy of the vertical enable.
process(clk)
begin
    if (clk = '1' and clk'event) then
        v_en_reg <= v_en_s;
    end if;
end process;

end arch_sync_unit;


                                                                                              PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
    clk        : in  std_logic;
    rstn       : in  std_logic;
    hsync      : in  std_logic;
    vsync      : in  std_logic;
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
);
end pixelgen;

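architecture behv of pixelgen is
    signal addr_s : std_logic_vector(17 downto 0);
begin
    addr <= addr_s;

    process(clk, rstn)
    begin
        if (rstn = '0') then
            rd     <= '0';
            red    <= "0";
            green  <= "0";
            blue   <= "0";
            addr_s <= x"0000" & "00";
        elsif (clk = '1' and clk'event) then
            -- Rising edge of v_en (start of a frame): restart the read address.
            if ((v_en = '1') and (v_en_reg = '0')) then
                addr_s <= (others => '0');
            end if;
            if (v_en = '1' and h_en = '1') then
                if (line_count <= 514) then
                    if (c_hs <= 655) then
                        -- Inside the picture: read memory and drive all
                        -- three colour outputs with the 1-bit pixel value.
                        rd     <= '1';
                        red    <= data_in;
                        green  <= data_in;
                        blue   <= data_in;
                        addr_s <= addr_s + "000000000000000001";
                    else
                        -- Right of the picture: green.
                        rd    <= '0';
                        red   <= "0";
                        green <= "1";
                        blue  <= "0";
                    end if;
                else
                    -- Below the picture: white.
                    rd    <= '0';
                    red   <= "1";
                    green <= "1";
                    blue  <= "1";
                end if;
                -- Wrap the address after the last location is reached.
                if (addr_s = "111011111111111111") then
                    addr_s <= (others => '0');
                end if;
            else
                -- Blanking interval: drive black.
                red   <= "0";
                green <= "0";
                blue  <= "0";
            end if;
        end if;
        -- end if;
        --end if;
    end process;
end behv;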


                                                                                              12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be wired together, somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
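As a rough illustration of such a logic block, the sketch below models a 3-input lookup table followed by an optional flip-flop. The entity name, the generics LUT_INIT and USE_FF, and the port names are assumptions made for this example and do not correspond to any vendor primitive. It uses IEEE.NUMERIC_STD rather than the STD_LOGIC_ARITH package used in the listings above.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input logic block: a lookup table plus an optional flip-flop.
entity logic_block is
    generic (
        -- Truth table contents; bit i is the output for input value i.
        -- x"96" = "10010110" encodes the 3-input XOR (odd parity) function.
        LUT_INIT : std_logic_vector(7 downto 0) := x"96";
        -- true: registered output, false: purely combinational output
        USE_FF   : boolean := false
    );
    port (
        clk : in  std_logic;
        a   : in  std_logic_vector(2 downto 0);  -- LUT address inputs
        q   : out std_logic
    );
end logic_block;

architecture sketch of logic_block is
    signal lut_out : std_logic;
    signal ff_out  : std_logic := '0';
begin
    -- The LUT behaves as an 8-entry memory indexed by the inputs.
    lut_out <= LUT_INIT(to_integer(unsigned(a)));

    -- Memory element that follows the LUT.
    process (clk)
    begin
        if rising_edge(clk) then
            ff_out <= lut_out;
        end if;
    end process;

    -- Select the registered or the combinational path.
    q <= ff_out when USE_FF else lut_out;
end sketch;
```

Changing only LUT_INIT turns the same block into a different gate. For example, x"80" gives a 3-input AND, because only input value 7 produces a '1'.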

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array, the XC2064, in 1985. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987           9,000 gates (Xilinx)
1992           600,000 (Naval Surface Warfare Department)
Early 2000s    Millions

Market size

1985             First commercial FPGA technology invented by Xilinx
1987             $14 million
~1993            >$385 million
2005             $1.9 billion
2010 estimates   $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has bolstered development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical for implementation within them) but also far more complex to design for.
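The CPLD structure described above, a sum-of-products array feeding a clocked register, can be sketched in a few lines. The entity name, port names, and the particular product terms are hypothetical and chosen only for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical CPLD-style macrocell: two product terms are ORed together
-- and the result is captured in a single clocked register.
entity sop_macrocell is
    port (
        clk  : in  std_logic;
        rstn : in  std_logic;   -- active-low asynchronous reset
        a    : in  std_logic;
        b    : in  std_logic;
        c    : in  std_logic;
        q    : out std_logic
    );
end sop_macrocell;

architecture sketch of sop_macrocell is
    signal p0, p1, sum : std_logic;
begin
    p0  <= a and b;          -- product term 0
    p1  <= (not a) and c;    -- product term 1
    sum <= p0 or p1;         -- sum of products

    -- The register that the logic array feeds.
    process (clk, rstn)
    begin
        if rstn = '0' then
            q <= '0';
        elsif rising_edge(clk) then
            q <= sum;
        end if;
    end process;
end sketch;
```

Every input passes through the same AND, OR, and register stages, which is one way to see why CPLD timing tends to be more predictable than timing through routed FPGA interconnect.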

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
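The precision and range trade-off mentioned above can be illustrated with a small software model of fixed-point quantization. This is only a sketch: the function name `quantize` and the chosen bit widths are illustrative assumptions, not part of the original design. A narrower format needs less logic per arithmetic unit, at the cost of a larger rounding error and a smaller representable range.

```python
def quantize(value, frac_bits, total_bits):
    """Round value to a signed fixed-point number.

    total_bits is the full word width and frac_bits of those bits are
    fractional. Values outside the representable range saturate.
    (Hypothetical helper, used for illustration only.)
    """
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    raw = max(lowest, min(highest, round(value * scale)))
    return raw / scale


if __name__ == "__main__":
    for total_bits, frac_bits in ((8, 4), (16, 12)):
        approx = quantize(0.7, frac_bits, total_bits)
        print(total_bits, "bits:", approx, "error", abs(approx - 0.7))
```

With 8 bits (4 fractional) the value 0.7 is stored as 0.6875, and anything above 7.9375 saturates; the 16-bit format is more precise but each arithmetic unit built for it is wider.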

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

                                                                                              Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
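As a rough illustration of the kind of estimate involved, Rent's rule relates the number of external terminals T of a block of logic to the number of logic blocks g it contains, T = t * g^p, where t is the average number of terminals per block and p is the Rent exponent. The sketch below only evaluates that formula; the function name and the parameter values are illustrative assumptions, and how a vendor turns such estimates into a channel width is not described in this report.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g**p.

    The values passed in below are made-up examples, not measured data.
    """
    return terminals_per_block * blocks ** rent_exponent


if __name__ == "__main__":
    # Same amount of logic, different interconnect complexity.
    for p in (0.5, 0.8):
        print("p =", p, "->", rent_terminals(1000, 4.0, p), "terminals")
```

A design with a higher Rent exponent needs more connections for the same amount of logic, which is why two designs of equal gate count can have very different routing demands.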

The example FPGA described here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All
the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                                              Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
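The behaviour of such a logic block can be sketched in software. The model below is a simplified illustration, not vendor code: the class name `LogicBlock`, the helper `truth_table_of`, and the bit ordering of the 16 configuration bits are all assumptions made for this example.

```python
class LogicBlock:
    """Model of a classic logic block: one 4-input LUT plus one flip-flop."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table  # bit i is the output for input pattern i
        self.registered = registered    # selects registered or unregistered output
        self.ff = 0                     # flip-flop state, cleared at start

    def lut(self, a, b, c, d):
        """Combinational lookup: the four inputs form an index into the table."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """The single block output, registered or unregistered."""
        return self.ff if self.registered else self.lut(a, b, c, d)


def truth_table_of(func):
    """Build the 16 configuration bits for any 4-input Boolean function."""
    table = 0
    for i in range(16):
        a, b, c, d = i & 1, (i >> 1) & 1, (i >> 2) & 1, (i >> 3) & 1
        if func(a, b, c, d):
            table |= 1 << i
    return table


XOR4 = truth_table_of(lambda a, b, c, d: a ^ b ^ c ^ d)

if __name__ == "__main__":
    block = LogicBlock(XOR4, registered=True)
    block.clock(1, 0, 0, 0)
    print(block.output(0, 0, 0, 0))  # registered output holds the captured value
```

The same block implements any 4-input function simply by loading a different truth table, which is what makes the LUT the basic programmable element.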

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                              Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                                              Switch box topology
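The planar topology can be expressed as a small connectivity function. This is a sketch under stated assumptions: the side names, the zero-based track numbering, and the function name `planar_switch_box` are chosen for this example and do not come from the report.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(side, track, channel_width):
    """Return the wire ends reachable from a wire entering a planar switch box.

    A wire is identified by the side it enters from and its track number.
    In the planar (domain-based) topology it can reach only the same track
    number on each of the other three sides.
    """
    if side not in SIDES:
        raise ValueError("unknown side: " + repr(side))
    if not 0 <= track < channel_width:
        raise ValueError("track is outside the channel")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_switch_box("west", 1, channel_width=4))
```

Each wire has exactly three programmable switches, and a signal that starts on a given track stays on that track number throughout its route, which keeps the switch box small but restricts routing flexibility.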

Modern FPGA families expand upon the above capabilities to include higher
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it's possible to just specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to an actual configuration works as
follows. The source files are fed to a software suite from the FPGA/CPLD
vendor that, through different steps, will produce a file. This file is then
transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external
memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.

                                                                                              Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                              Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping an SoC or ASIC design on an FPGA
for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on an FPGA has become
a good method for doing this.

                                                                                              Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure
that it is functionally correct. This compares favorably with designers
relying only on software simulations to verify that their hardware design is
sound, since simulation speed and modeling accuracy limitations hinder
development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
found early.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, off the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.

13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                              14 APPLICATIONS

                                                                                              Motion Tracking Systems

                                                                                              Image processing systems

                                                                                              Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated
picture (pixelated) is taken from memory space and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.



    hsync <= '1';
  elsif (clk = '1' and clk'event) then
    -- hsync is driven low while the horizontal counter is at or below 95
    if (c_hs_s <= 95) then
      hsync <= '0';
    else
      hsync <= '1';
    end if;
  end if;
  --end if;
end process;

-- Horizontal Enable (row information will be sent in this period)
process(clk, rstn)
begin
  if (rstn = '0') then
    h_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((c_hs_s >= 143) and (c_hs_s <= 778)) then
      h_en_s <= '1';
    else
      h_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

--- vsync pulse

-- Vertical Enable (data is valid in this region)
process(clk, rstn)
begin
  if (rstn = '0') then
    v_en_s <= '0';
  elsif (clk = '1' and clk'event) then
    if ((line_count_s >= 35 and line_count_s <= 515)) then
      v_en_s <= '1';
    else
      v_en_s <= '0';
    end if;
  end if;
  --end if;
end process;

-- Vertical Sync, Front Porch, Back Porch
process(clk, rstn)
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
  --end if;
end process;

-- v_en_reg is a copy of v_en_s delayed by one clock
process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
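
The literal bounds used in the sync unit (95, 143, 778, 35, 515) are easier to audit when they are derived from named timing parameters. The following is a minimal sketch, assuming the nominal 640x480 at 60 Hz VGA timing and counters that start at the beginning of the sync pulse. The package name and all constant names are hypothetical and do not appear in the original design. The nominal active window computed here (144 to 783 horizontally, 35 to 514 vertically) is not identical to the bounds hard-coded in the listing, so it is a reference for comparison and not a drop-in replacement.

```vhdl
-- Hypothetical package: names are illustrative, not part of the original design.
-- Values are the nominal 640x480 @ 60 Hz timing (assumed, not stated in the source).
package vga_timing_pkg is
  -- Horizontal timing, in pixel clocks
  constant H_SYNC   : integer := 96;   -- sync pulse width
  constant H_BPORCH : integer := 48;   -- back porch
  constant H_ACTIVE : integer := 640;  -- visible pixels
  constant H_FPORCH : integer := 16;   -- front porch
  constant H_TOTAL  : integer := H_SYNC + H_BPORCH + H_ACTIVE + H_FPORCH;  -- 800

  -- Vertical timing, in lines
  constant V_SYNC   : integer := 2;    -- sync pulse width
  constant V_BPORCH : integer := 33;   -- back porch
  constant V_ACTIVE : integer := 480;  -- visible lines
  constant V_FPORCH : integer := 10;   -- front porch
  constant V_TOTAL  : integer := V_SYNC + V_BPORCH + V_ACTIVE + V_FPORCH;  -- 525

  -- Active window, counting from 0 at the start of the sync pulse
  constant H_ACTIVE_FIRST : integer := H_SYNC + H_BPORCH;               -- 144
  constant H_ACTIVE_LAST  : integer := H_ACTIVE_FIRST + H_ACTIVE - 1;   -- 783
  constant V_ACTIVE_FIRST : integer := V_SYNC + V_BPORCH;               -- 35
  constant V_ACTIVE_LAST  : integer := V_ACTIVE_FIRST + V_ACTIVE - 1;   -- 514
end package vga_timing_pkg;
```

With such a package, a comparison like c_hs_s <= 95 could be written as c_hs_s <= H_SYNC - 1, which makes the intent of each bound visible at the point of use.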


                                                                                                PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

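architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- rising edge of v_en (start of a frame): restart the read address
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- inside the image window: read memory and show the pixel
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- right of the image window: green
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- below the image window: white
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- wrap the address after 245759 (the binary literal below)
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- outside the enabled region: black
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;
end behv;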
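
The wrap-around literal in the pixel generator can be checked against the image size it appears to imply. The sketch below assumes a 512 x 480 frame buffer (512 reads per line while c_hs runs from 144 to 655, over 480 lines). That size is an inference from the constants in the listing and is not stated in the source. The entity name addr_check and its constants are hypothetical, and the sketch uses IEEE.NUMERIC_STD for the conversion, where the listing above uses the STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED packages.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical simulation-only entity; not part of the original design.
entity addr_check is
end entity addr_check;

architecture sim of addr_check is
  -- Assumed image size, inferred from the pixel generator's constants
  constant IMG_WIDTH  : integer := 512;
  constant IMG_HEIGHT : integer := 480;
  constant LAST_ADDR  : integer := IMG_WIDTH * IMG_HEIGHT - 1;

  -- Wrap-around literal copied from the pixel generator
  constant WRAP_VALUE : std_logic_vector(17 downto 0) := "111011111111111111";
begin
  process
  begin
    -- The literal should equal the last address of the assumed image
    assert to_integer(unsigned(WRAP_VALUE)) = LAST_ADDR
      report "wrap literal does not match IMG_WIDTH * IMG_HEIGHT - 1"
      severity failure;

    -- The last address must fit in the 18-bit address bus
    assert LAST_ADDR < 2**18
      report "last address does not fit in 18 bits"
      severity failure;

    report "address checks passed" severity note;
    wait;  -- stop the process; this is a one-shot check
  end process;
end architecture sim;
```

Under these assumptions, a full 640 x 480 frame (307,200 pixels) would not fit in an 18-bit address space, which may be why the listing reads a narrower window and fills the rest of each line with a fixed colour.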


                                                                                                12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks can
be configured to perform complex combinational functions or merely simple
logic gates like AND and XOR. In most FPGAs the logic blocks also include
memory elements, which may be simple flip-flops or more complete blocks of
memory.
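
The idea of a configurable logic block can be sketched in VHDL as a small lookup table followed by an optional flip-flop. This is a simplified model for illustration only, assuming a single 3-input lookup table per block; the entity name lut_block, its generic INIT, and its ports are hypothetical and do not describe any particular vendor's cell.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of a logic block: a 3-input LUT plus a flip-flop.
entity lut_block is
  generic (
    -- Truth table: INIT(i) is the output when the inputs encode the value i.
    -- "10010110" is 1 for inputs with an odd number of ones (3-input XOR).
    INIT : std_logic_vector(7 downto 0) := "10010110"
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(2 downto 0);  -- LUT inputs
    y   : out std_logic;                     -- combinational output
    q   : out std_logic                      -- registered output
  );
end entity lut_block;

architecture rtl of lut_block is
  signal lut_out : std_logic;
begin
  -- The inputs select one bit of the configuration word
  lut_out <= INIT(to_integer(unsigned(a)));
  y <= lut_out;

  -- Memory element: capture the LUT output on each rising clock edge
  process(clk)
  begin
    if rising_edge(clk) then
      q <= lut_out;
    end if;
  end process;
end architecture rtl;
```

Changing only the configuration word changes the function: setting INIT to "10000000" would turn the same block into a 3-input AND gate, which is the sense in which the logic is programmable.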

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of being
programmed in batches in a factory or in the field (field programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array in 1985: the XC2064. The
XC2064 had programmable gates and programmable interconnects between gates,
the beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable
logic arrays, gates, and logic blocks are founded in patents awarded to David
W. Page and LuVerne R. Peterson in 1985.

In the late 1980s the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement 600,000
reprogrammable gates. Casselman was successful, and the system was awarded a
patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s,
when competitors sprouted up, eroding significant market share. By 1993, Actel
was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and in volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a complete
"system on a programmable chip". This work mirrors the architecture by Ron
Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a
reconfigurable CPU architecture on a single chip called the SB24. That work
was done in 1982. Examples of such hybrid technologies can be found in the
Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC
processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is
another such device, which uses an AVR processor in combination with Atmel's
programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed
at run time, and this is leading to the idea of reconfigurable computing or
reconfigurable systems: CPUs that reconfigure themselves to suit the task at
hand. The Mitrion Virtual Processor from Mitrionics is an example of a
reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at runtime, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid
approach by providing an array of processor cores and FPGA-like programmable
cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, which
the company credits for the growing number of FPGA design starts (see
History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

                                                                                                123 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point where
some are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late 1990s,
applications which had traditionally been the sole reserve of DSPs began to
incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
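As a rough illustration of the precision-for-parallelism trade-off described above, the following Python sketch assumes a fixed logic budget and a multiplier whose cost grows with the square of its operand width. The cost model, the LUT budget, and the clock rate are all hypothetical assumptions chosen for illustration; they are not taken from any vendor data.

```python
def parallel_units(lut_budget: int, bit_width: int) -> int:
    """Multipliers that fit in the budget, ASSUMING each costs bit_width**2 LUTs."""
    return lut_budget // (bit_width * bit_width)


def throughput_mops(lut_budget: int, bit_width: int, clock_mhz: float) -> float:
    """Millions of multiplications per second, one result per unit per clock cycle."""
    return parallel_units(lut_budget, bit_width) * clock_mhz


if __name__ == "__main__":
    budget = 40_000  # hypothetical LUT budget
    clock = 100.0    # hypothetical clock rate in MHz
    for width in (32, 16, 8):
        units = parallel_units(budget, width)
        rate = throughput_mops(budget, width, clock)
        print(f"{width:2d}-bit operands: {units:4d} units, {rate:.0f} Mops/s")
```

Under this assumed model, halving the operand width roughly quadruples the number of units that fit, which is why a narrower number format can raise throughput at the same clock rate.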

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8 hours
is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and
performance dynamics have broadened the range of viable applications.
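The low-volume argument above can be made concrete with a simple break-even calculation. The Python sketch below assumes a linear cost model (one-time engineering cost plus a constant cost per unit); the function names and every cost figure are hypothetical and serve only to show the shape of the trade-off.

```python
import math


def total_cost(nre: int, unit_cost: int, volume: int) -> int:
    """One-time engineering (NRE) cost plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: int, asic_unit: int,
                      fpga_nre: int, fpga_unit: int) -> int:
    """Smallest volume at which the ASIC total cost no longer exceeds the FPGA's."""
    if fpga_unit <= asic_unit:
        raise ValueError("no break-even: FPGA unit cost must exceed ASIC unit cost")
    return max(0, math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit)))


if __name__ == "__main__":
    # Hypothetical figures in arbitrary currency units.
    volume = break_even_volume(asic_nre=1_000_000, asic_unit=5,
                               fpga_nre=50_000, fpga_unit=30)
    print("break-even volume:", volume)
```

Below the break-even volume the higher per-unit price of the programmable chip is outweighed by the avoided ASIC development cost; above it the ASIC becomes cheaper overall.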

                                                                                                Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
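Rent's rule relates the number of external terminals T of a block of logic to the number of gates g it contains, T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The Python sketch below simply evaluates that formula; the parameter values in the usage lines are hypothetical and are not measurements of any real design.

```python
def rent_terminals(gates: int, t: float, p: float) -> float:
    """Estimated terminal count from Rent's rule: T = t * gates**p."""
    if gates < 1:
        raise ValueError("gate count must be positive")
    return t * gates ** p


if __name__ == "__main__":
    # Two hypothetical designs with the same gate count but different exponents.
    for label, exponent in (("regular, local wiring", 0.5),
                            ("heavy global wiring", 0.75)):
        estimate = rent_terminals(256, t=4.0, p=exponent)
        print(f"{label}: about {estimate:.0f} terminals")
```

A larger exponent models a design with more global interconnect, which is the situation in which equal gate counts can still demand very different numbers of routing tracks.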

The example FPGA described here is an array, or island-style, FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

                                                                                                121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                Typical logic block


There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are managed separately.
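The logic block described above can be modelled in a few lines of Python. This is only a behavioural sketch under stated assumptions: the class name, the packing of the 16-entry truth table into an integer, and the input ordering (input a is the least significant index bit) are illustrative choices, not any vendor's actual configuration format.

```python
class LogicBlock:
    """Behavioural model: a 4-input LUT feeding an optional D flip-flop."""

    def __init__(self, truth_table: int, registered: bool) -> None:
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 bits")
        self.truth_table = truth_table  # bit i is the output for input index i
        self.registered = registered    # selects registered or unregistered output
        self.q = 0                      # flip-flop state, assumed to reset to 0

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the configured function value for the four inputs."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: flip-flop state or raw LUT value."""
        return self.q if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000, registered=False)   # only index 15 yields 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 0, 1, 1))
```

Because any 16-bit table can be loaded, the same block implements any Boolean function of four inputs; 0x8000 gives a 4-input AND and 0x6996 gives 4-input parity.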

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                                Switch box topology
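The planar (domain-based) rule can be stated compactly in code. The Python sketch below lists the wires reachable from one incoming wire; the side names and the 1-based track numbering are assumptions made for illustration.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_connections(side: str, track: int, channel_width: int) -> list:
    """Wires reachable from (side, track) in a planar switch box.

    A wire may connect only to the same track number on each of the
    other three sides, giving three programmable switches per wire.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number outside the channel width")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_connections("left", 1, channel_width=4))
```

Since the track number never changes along a path, a signal that starts on track 1 stays in that routing domain until it reaches its destination.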

Modern FPGA families expand upon the above capabilities to include higher-level
functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

                                                                                                124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
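Repeating the simulation on the gate-level description amounts to checking that the netlist still behaves like the RTL. The Python sketch below mimics that check outside of any HDL simulator: a one-bit full adder stands in for the design, and both function names are hypothetical. A real flow would run the same test bench in a VHDL or Verilog simulator instead.

```python
from itertools import product


def adder_behavioural(a: int, b: int, cin: int) -> tuple:
    """RTL-style description: add the three bits arithmetically."""
    total = a + b + cin
    return total & 1, total >> 1          # (sum, carry)


def adder_gate_level(a: int, b: int, cin: int) -> tuple:
    """Gate-level description built only from XOR, AND, and OR."""
    partial = a ^ b
    return partial ^ cin, (a & b) | (cin & partial)


def netlist_matches_rtl() -> bool:
    """Exhaustive test bench: compare both models on every input vector."""
    return all(adder_behavioural(*bits) == adder_gate_level(*bits)
               for bits in product((0, 1), repeat=3))


if __name__ == "__main__":
    print("gate-level model matches RTL:", netlist_matches_rtl())
```

Exhaustive comparison is only practical for tiny blocks like this one; larger designs rely on directed or randomised test vectors, and the timing-annotated simulation adds delays that this purely functional sketch does not model.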

                                                                                                Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

                                                                                                125 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verification of hardware designs together with early software and
firmware co-design has become mainstream. Prototyping SoC and ASIC
designs on an FPGA has become a good method to do this.


Reasons why prototyping is important:

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be uncovered early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                                65

                                                                                                13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

                                                                                                14 APPLICATIONS

                                                                                                Motion Tracking Systems

                                                                                                Image processing systems

                                                                                                Displaying Systems

14.1 CONCLUSION

Hence, by implementing the VGA Controller on an FPGA, a processor-generated (pixelated) picture is taken from memory space and the Controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




v_en_s <= '0';
elsif (clk = '1' and clk'event) then
  if ((line_count_s >= 35 and line_count_s <= 515)) then
    v_en_s <= '1';
  else
    v_en_s <= '0';
  end if;
end if;
--end if;
end process;

process(clk, rstn) -- Vertical Sync, Front porch, Back Porch
begin
  if (rstn = '0') then
    vsync <= '1';
  elsif (clk = '1' and clk'event) then
    if (line_count_s < 2) then
      vsync <= '0';
    else
      vsync <= '1';
    end if;
  end if;
  --end if;
end process;

process(clk)
begin
  if (clk = '1' and clk'event) then
    v_en_reg <= v_en_s;
  end if;
end process;

end arch_sync_unit;
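The vertical timing encoded by the comparisons in the listing above can be restated with named constants. The following is a minimal sketch, not the report's sync unit: the entity name v_timing_sketch, the constant names, and the integer range of line_count are assumptions made for illustration. Only the thresholds (sync pulse on lines 0 and 1, enable on lines 35 to 515) are taken from the listing.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical restatement of the vertical timing; all names are illustrative.
entity v_timing_sketch is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;               -- active-low asynchronous reset
    line_count : in  integer range 0 to 1023; -- current scan line (assumed range)
    vsync      : out std_logic;               -- active-low vertical sync
    v_en       : out std_logic                -- high inside the enable window
  );
end v_timing_sketch;

architecture rtl of v_timing_sketch is
  constant V_SYNC_LINES : integer := 2;   -- vsync is low on lines 0 and 1
  constant V_EN_FIRST   : integer := 35;  -- first line with v_en high
  constant V_EN_LAST    : integer := 515; -- last line with v_en high
begin
  process (clk, rstn)
  begin
    if rstn = '0' then
      vsync <= '1';
      v_en  <= '0';
    elsif rising_edge(clk) then
      -- Sync pulse: asserted (low) only during the first V_SYNC_LINES lines.
      if line_count < V_SYNC_LINES then
        vsync <= '0';
      else
        vsync <= '1';
      end if;
      -- Vertical enable: asserted inside the inclusive window.
      if line_count >= V_EN_FIRST and line_count <= V_EN_LAST then
        v_en <= '1';
      else
        v_en <= '0';
      end if;
    end if;
  end process;
end rtl;
```

Counted inclusively, lines 35 to 515 span 481 lines. Assuming the pixel generator's line_count input carries the same counter, its line_count <= 514 check narrows the displayed region to lines 35 to 514, which is 480 lines. The offset of 35 is consistent with the commonly published 640 x 480 at 60 Hz timing of 2 sync lines followed by 33 back-porch lines.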


                                                                                                  PIXEL GENERATOR

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_ARITH.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;

entity pixelgen is port(
  clk        : in  std_logic;
  rstn       : in  std_logic;
  hsync      : in  std_logic;
  vsync      : in  std_logic;
  h_en       : in  std_logic;
  v_en       : in  std_logic;
  v_en_reg   : in  std_logic;
  data_in    : in  std_logic_vector(0 downto 0);
  line_count : in  integer;
  c_hs       : in  integer;
  red        : out std_logic_vector(0 downto 0);
  green      : out std_logic_vector(0 downto 0);
  blue       : out std_logic_vector(0 downto 0);
  addr       : out std_logic_vector(17 downto 0);
  rd         : out std_logic
);
end pixelgen;

architecture behv of pixelgen is
  signal addr_s : std_logic_vector(17 downto 0);
begin
  addr <= addr_s;

  process(clk, rstn)
  begin
    if (rstn = '0') then
      -- Asynchronous reset: no read, black output, address zero.
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= x"0000" & "00";
    elsif (clk = '1' and clk'event) then
      -- Restart the address when v_en is high but its delayed copy is still low.
      if ((v_en = '1') and (v_en_reg = '0')) then
        addr_s <= (others => '0');
      end if;
      if (v_en = '1' and h_en = '1') then
        if (line_count <= 514) then
          if (c_hs <= 655) then
            -- Read window: drive all three colours from the memory bit
            -- and advance to the next address.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + "000000000000000001";
          else
            -- Past column count 655: stop reading and drive green only.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- Past line 514: stop reading and drive white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;
        -- Wrap the address after the last location.
        if (addr_s = "111011111111111111") then
          addr_s <= (others => '0');
        end if;
      else
        -- Outside the enabled area: drive black.
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
    -- end if;
    --end if;
  end process;
end behv;


                                                                                                  12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
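The idea of a configurable logic block can be illustrated in HDL. The following is a minimal behavioural sketch under stated assumptions, not a description of any vendor's block: the entity name logic_block_sketch, the generics LUT_INIT and USE_FF, and the choice of a 4-input lookup table are all hypothetical. It shows a truth table selecting the combinational function and an optional flip-flop acting as the memory element.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical model of one logic block: a 4-input LUT plus an optional flip-flop.
entity logic_block_sketch is
  generic (
    -- Truth table, one bit per input combination; x"8000" is a 4-input AND.
    LUT_INIT : std_logic_vector(15 downto 0) := x"8000";
    -- true: output comes from the flip-flop; false: purely combinational.
    USE_FF   : boolean := false
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);
    q   : out std_logic
  );
end logic_block_sketch;

architecture rtl of logic_block_sketch is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- The four inputs, read as an unsigned number, select one truth-table bit.
  lut_out <= LUT_INIT(to_integer(unsigned(a)));

  -- Memory element: captures the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Configuration choice between registered and combinational output.
  q <= ff_out when USE_FF else lut_out;
end rtl;
```

Changing only the generic changes the function: with LUT_INIT set to x"6996" the same block computes the XOR (odd parity) of its four inputs, which mirrors how one physical block can serve as different gates depending on its configuration.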

History: The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs can be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at run time, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: millions

Market size

1985: first commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing the final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:
IC costs are rising aggressively
ASIC complexity has increased development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative to ASICs for a growing
number of higher-volume applications than they have historically been used
for, which the company credits for the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result is less flexibility, with the advantage of
more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical to implement within them) but also far more complex to design
for.
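
The CPLD structure described above, a sum-of-products array feeding a clocked register, can be sketched in software. The following is a minimal Python model, not vendor code; the names sum_of_products, Macrocell and XOR_TERMS are hypothetical and chosen only for illustration.

```python
from itertools import product


def sum_of_products(terms, inputs):
    """Evaluate a sum-of-products expression.

    terms:  list of product terms; each term maps an input index to the
            value that input must have (1 = true literal, 0 = complemented).
    inputs: tuple of 0/1 input values.
    """
    return int(any(all(inputs[i] == v for i, v in term.items())
                   for term in terms))


class Macrocell:
    """One sum-of-products array feeding one clocked register."""

    def __init__(self, terms):
        self.terms = terms
        self.q = 0  # register state

    def clock(self, inputs):
        # On a clock edge the register captures the array output.
        self.q = sum_of_products(self.terms, inputs)
        return self.q


# Illustrative configuration: XOR of two inputs as (not a and b) or (a and not b).
XOR_TERMS = [{0: 0, 1: 1}, {0: 1, 1: 0}]

if __name__ == "__main__":
    for bits in product((0, 1), repeat=2):
        print(bits, sum_of_products(XOR_TERMS, bits))
```

Because every output is a fixed two-level AND-OR structure, the delay from input to register does not depend on how the logic is placed, which is one way to read the "more predictable timing" remark.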

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
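
The precision side of this trade-off can be illustrated with a small fixed-point experiment. The sketch below is plain Python, not FPGA code; the helper names to_fixed and fixed_dot and the sample vectors are assumptions made for illustration. It only shows how the error of a dot product grows as fractional bits are removed. The corresponding gain, that narrower multipliers are smaller so more of them fit on a device, is not modelled here.

```python
def to_fixed(x, frac_bits):
    """Quantize x to a signed fixed-point integer with frac_bits fractional bits."""
    return round(x * (1 << frac_bits))


def fixed_dot(a, b, frac_bits):
    """Dot product computed on integers, then scaled back to a float."""
    acc = sum(to_fixed(x, frac_bits) * to_fixed(y, frac_bits)
              for x, y in zip(a, b))
    # Each product carries 2 * frac_bits fractional bits.
    return acc / (1 << (2 * frac_bits))


# Illustrative data only.
a = [0.3, -0.7, 0.45]
b = [0.9, 0.2, -0.6]

exact = sum(x * y for x, y in zip(a, b))
coarse = fixed_dot(a, b, 4)    # 4 fractional bits: small multipliers
fine = fixed_dot(a, b, 12)     # 12 fractional bits: larger multipliers

if __name__ == "__main__":
    print("exact ", exact)
    print("4-bit ", coarse, "error", abs(coarse - exact))
    print("12-bit", fine, "error", abs(fine - exact))
```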

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

                                                                                                  Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and I/Os
can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
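
Rent's rule is commonly written as T = t * g^p, where g is the number of logic blocks in a region, t is the average number of terminals per block, p is the Rent exponent, and T is the estimated number of connections leaving the region. The Python sketch below evaluates this formula. The function name rent_terminals and the parameter values are assumptions chosen for illustration, not measurements of any real design or device.

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Rent's rule estimate T = t * g**p of the external connections of a
    region containing num_blocks logic blocks."""
    return terminals_per_block * num_blocks ** rent_exponent


if __name__ == "__main__":
    blocks = 256   # same amount of logic in both cases
    t = 5          # e.g. four LUT inputs plus one output per block

    # Illustrative exponents only: a low exponent stands for a design with
    # mostly local wiring, a high one for a wiring-heavy design.
    local_wiring = rent_terminals(blocks, t, 0.5)
    heavy_wiring = rent_terminals(blocks, t, 0.75)

    print("mostly local wiring:", local_wiring)
    print("wiring-heavy design:", heavy_wiring)
```

Two designs with the same block count can therefore differ several-fold in routing demand, which is why the channel width has to be chosen from estimates or experiments rather than from the logic count alone.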

The FPGA considered here is an array, or island-style, FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height of
one row or the width of one column, as shown below. All the routing channels
have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                                                  Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
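
The behavior of such a logic block can be modelled in a few lines. The following Python sketch is an illustration only; the class name LogicBlock, its methods and the AND4 configuration are hypothetical. The 16 configuration bits play the role of the LUT contents, and a flag selects the registered or unregistered output.

```python
class LogicBlock:
    """4-input LUT followed by a D flip-flop; one output, registered or not."""

    def __init__(self, truth_table, registered):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs 16 configuration bits")
        self.truth_table = list(truth_table)
        self.registered = registered
        self.ff = 0  # flip-flop state

    def lut(self, a, b, c, d):
        # The four inputs form an address into the 16-entry table.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return self.truth_table[index]

    def output(self, a, b, c, d):
        """Value on the single output pin for the current inputs."""
        return self.ff if self.registered else self.lut(a, b, c, d)

    def clock_edge(self, a, b, c, d):
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)


# Configure the LUT as a 4-input AND: only address 15 holds a 1.
AND4 = [0] * 15 + [1]

if __name__ == "__main__":
    block = LogicBlock(AND4, registered=True)
    print(block.output(1, 1, 1, 1))   # flip-flop not clocked yet
    block.clock_edge(1, 1, 1, 1)
    print(block.output(0, 0, 0, 0))   # holds the captured value
```

Any 4-input Boolean function is obtained by changing only the contents of truth_table, which is the sense in which the block is configurable.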

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                  Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                                  Switch box topology
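
The planar topology can also be written down as a connection table. The Python sketch below is a simplified model under stated assumptions: the side names, the function names planar_switch_box and reachable, and the zero-based track numbering are all chosen for illustration. It enumerates the programmable switches of one switch box and checks which wires a given wire can reach.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width):
    """Return the programmable switches of a planar (domain-based) switch box.

    Each switch is a pair of (side, track) endpoints. A wire on track t of
    one side can reach track t on each of the other three sides and nothing
    else. Tracks are numbered from 0 here.
    """
    switches = set()
    for track in range(channel_width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                switches.add(((side_a, track), (side_b, track)))
    return switches


def reachable(switches, wire):
    """Wires that the given (side, track) wire can be switched onto."""
    out = set()
    for a, b in switches:
        if a == wire:
            out.add(b)
        elif b == wire:
            out.add(a)
    return out


if __name__ == "__main__":
    box = planar_switch_box(channel_width=4)
    print(len(box), "switches in total")
    print(sorted(reachable(box, ("north", 1))))
```

With four sides there are six side pairs per track, so a box of channel width W holds 6 * W switches in this model, and a signal that starts on one track stays on that track number along its whole route.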

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it is possible to specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license),
and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
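
The idea of repeating the same test bench against the RTL description and against the gate level description can be sketched without an HDL simulator. The Python example below is an analogy under stated assumptions, not a real tool flow: adder_rtl stands in for a behavioral description, adder_gate_level stands in for a synthesized netlist, and run_test_bench is a hypothetical test bench that applies every input vector to both. Timing and back-annotated delays are not modelled.

```python
from itertools import product


def adder_rtl(a, b, cin):
    """Behavioral ("RTL") model: 2-bit add with carry in, returns (sum, cout)."""
    total = a + b + cin
    return total & 0b11, total >> 2


def full_adder_gates(a, b, cin):
    """Gate level full adder built only from XOR, AND and OR."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def adder_gate_level(a, b, cin):
    """Stand-in for a post-synthesis netlist: two chained full adders."""
    s0, c0 = full_adder_gates(a & 1, b & 1, cin)
    s1, c1 = full_adder_gates((a >> 1) & 1, (b >> 1) & 1, c0)
    return (s1 << 1) | s0, c1


def run_test_bench():
    """Apply every input vector to both models and collect mismatches."""
    mismatches = []
    for a, b, cin in product(range(4), range(4), range(2)):
        if adder_rtl(a, b, cin) != adder_gate_level(a, b, cin):
            mismatches.append((a, b, cin))
    return mismatches


if __name__ == "__main__":
    failures = run_test_bench()
    print("mismatching vectors:", failures)
```

An empty mismatch list plays the role of confirming that synthesis proceeded without errors. For designs with many inputs, exhaustive vectors are not practical and directed or random stimulus would be used instead.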

                                                                                                  Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                  Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.


                                                                                                  Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with relying only on software simulations to verify that the hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This exposes many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.


                                                                                                  13 SIMULATION RESULTS

GATE LEVEL
Fig. VGA CONTROLLER

GATE LEVEL
Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                                  14 APPLICATIONS

                                                                                                  Motion Tracking Systems

                                                                                                  Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.




                                                                                                    PIXEL GENERATOR

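library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
-- NUMERIC_STD is the standardized replacement for the vendor packages
-- STD_LOGIC_ARITH and STD_LOGIC_UNSIGNED.
use IEEE.NUMERIC_STD.ALL;

entity pixelgen is
  port (
    clk        : in  std_logic;
    rstn       : in  std_logic;  -- asynchronous reset, active low
    hsync      : in  std_logic;  -- unused in this architecture
    vsync      : in  std_logic;  -- unused in this architecture
    h_en       : in  std_logic;
    v_en       : in  std_logic;
    v_en_reg   : in  std_logic;
    data_in    : in  std_logic_vector(0 downto 0);
    line_count : in  integer;
    c_hs       : in  integer;
    red        : out std_logic_vector(0 downto 0);
    green      : out std_logic_vector(0 downto 0);
    blue       : out std_logic_vector(0 downto 0);
    addr       : out std_logic_vector(17 downto 0);
    rd         : out std_logic
  );
end pixelgen;

architecture behv of pixelgen is
  -- Address at which the counter wraps back to zero.
  constant LAST_ADDR : unsigned(17 downto 0) := "111011111111111111";
  signal addr_s : unsigned(17 downto 0);
begin
  addr <= std_logic_vector(addr_s);

  process (clk, rstn)
  begin
    if rstn = '0' then
      rd     <= '0';
      red    <= "0";
      green  <= "0";
      blue   <= "0";
      addr_s <= (others => '0');
    elsif rising_edge(clk) then
      -- v_en high while v_en_reg is still low: restart the address counter.
      if v_en = '1' and v_en_reg = '0' then
        addr_s <= (others => '0');
      end if;

      if v_en = '1' and h_en = '1' then
        if line_count <= 514 then
          if c_hs <= 655 then
            -- Drive all three channels from the memory bit and
            -- advance to the next address.
            rd     <= '1';
            red    <= data_in;
            green  <= data_in;
            blue   <= data_in;
            addr_s <= addr_s + 1;
          else
            -- c_hs above 655: stop reading and output green.
            rd    <= '0';
            red   <= "0";
            green <= "1";
            blue  <= "0";
          end if;
        else
          -- line_count above 514: stop reading and output white.
          rd    <= '0';
          red   <= "1";
          green <= "1";
          blue  <= "1";
        end if;

        -- Wrap at the last address; this later assignment overrides
        -- the increment above.
        if addr_s = LAST_ADDR then
          addr_s <= (others => '0');
        end if;
      else
        -- Display not enabled: output black (rd keeps its value).
        red   <= "0";
        green <= "0";
        blue  <= "0";
      end if;
    end if;
  end process;
end behv;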
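The constants in the pixel generator can be checked by hand. The arithmetic below is an inference from the numbers in the listing, not something the report states. In particular, the starting counts 144 and 35 are assumptions taken from standard 640 x 480 timing (96 + 48 clocks of horizontal sync and back porch, 2 + 33 lines vertically) and may not match the counters actually produced by the sync unit.

```latex
\begin{align*}
111011111111111111_2 &= (2^{18} - 1) - 2^{14} = 262143 - 16384 = 245759 \\
245759 + 1 &= 245760 = 512 \times 480 \\
655 - 144 + 1 &= 512 \quad \text{(pixels read per line, assuming \texttt{c\_hs} is 144 at the first active pixel)} \\
514 - 35 + 1 &= 480 \quad \text{(lines read, assuming \texttt{line\_count} is 35 at the first active line)}
\end{align*}
```

Under these assumptions the 18-bit address counter steps through a 512 x 480 image of one-bit pixels and wraps exactly once per frame. The remaining columns of each 640-pixel line would then fall in the branch that outputs a solid colour.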


                                                                                                    12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks" and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
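To make the idea of a configurable logic block concrete, the following is a minimal sketch of a 3-input lookup table in VHDL, together with a small self-checking testbench. The names lut3, INIT and lut3_tb are hypothetical and chosen only for illustration. They are not taken from any vendor library, and a real logic block contains more than a bare table.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical 3-input lookup table: INIT holds the 8-entry truth table
-- and the inputs select one bit of it.
entity lut3 is
  generic (INIT : std_logic_vector(7 downto 0) := x"80");
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end lut3;

architecture rtl of lut3 is
  signal sel : std_logic_vector(2 downto 0);
begin
  sel <= c & b & a;                          -- c is the most significant bit
  y   <= INIT(to_integer(unsigned(sel)));    -- table lookup
end rtl;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Testbench (illustrative): the same entity is configured once as AND
-- and once as XOR, purely by changing the table contents.
entity lut3_tb is
end lut3_tb;

architecture sim of lut3_tb is
  signal a, b, c      : std_logic := '0';
  signal y_and, y_xor : std_logic;
begin
  u_and : entity work.lut3
    generic map (INIT => x"80")              -- only row 111 is '1'
    port map (a => a, b => b, c => c, y => y_and);

  u_xor : entity work.lut3
    generic map (INIT => x"96")              -- rows with odd parity are '1'
    port map (a => a, b => b, c => c, y => y_xor);

  check : process
    variable v : std_logic_vector(2 downto 0);
  begin
    for i in 0 to 7 loop
      v := std_logic_vector(to_unsigned(i, 3));
      a <= v(0);
      b <= v(1);
      c <= v(2);
      wait for 10 ns;
      assert y_and = (v(0) and v(1) and v(2))
        report "AND mismatch" severity failure;
      assert y_xor = (v(0) xor v(1) xor v(2))
        report "XOR mismatch" severity failure;
    end loop;
    wait;  -- all eight input combinations checked
  end process;
end sim;
```

The two instances share identical wiring. Only the stored table differs, which is what lets the same silicon act as an AND gate in one design and an XOR gate in another.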

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at "run time", and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing the final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company attributes the growing number of FPGA design starts to them (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a somewhat restrictive structure consisting of one or more programmable sum-of-products logic arrays feeding a relatively small number of clocked registers. The result is less flexibility, with the advantage of more predictable timing delays and a higher logic-to-interconnect ratio. FPGA architectures, on the other hand, are dominated by interconnect. This makes them far more flexible (in terms of the range of designs that are practical to implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most FPGAs of higher-level embedded functions (such as adders and multipliers) and embedded memories, as well as logic blocks that can implement decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined radio, aerospace and defense systems, ASIC prototyping, medical imaging, computer vision, speech recognition, cryptography, bioinformatics, computer hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space, that of glue logic for PCBs. As their size, capabilities, and speed increased, they began to take over larger and larger functions, to the point where some are now marketed as full systems on chips (SoC). Particularly with the introduction of dedicated multipliers into FPGA architectures in the late 1990s, applications which had traditionally been the sole reserve of DSPs began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use of the massive parallelism offered by their architecture. One such area is code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing applications, where computational kernels such as FFT or convolution are performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
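
As a rough illustration of this precision-for-parallelism trade-off, the Python sketch below uses a toy cost model that is not taken from the source: it assumes a multiplier's area grows with the square of its operand width and that every unit completes one operation per clock cycle. The area budget, clock rate, and function names are hypothetical.

```python
def parallel_units(area_budget, bit_width):
    """Number of multipliers that fit in a fixed area budget.

    Assumption (not from the text): area scales with bit_width squared,
    a common first-order estimate for array multipliers.
    """
    return area_budget // (bit_width * bit_width)


def throughput(area_budget, bit_width, clock_hz):
    """Operations per second, assuming one operation per unit per cycle."""
    return parallel_units(area_budget, bit_width) * clock_hz


AREA_BUDGET = 65536      # hypothetical area units
CLOCK_HZ = 100_000_000   # hypothetical 100 MHz clock

for width in (32, 16, 8):
    units = parallel_units(AREA_BUDGET, width)
    print(width, units, throughput(AREA_BUDGET, width, CLOCK_HZ))
```

Under these assumptions, halving the operand width quadruples the number of units that fit, so the aggregate rate rises even though the clock stays the same. Real devices will not follow this model exactly.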

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in per-unit hardware costs for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.
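
The volume argument above can be made concrete with a simple linear cost model: total cost = one-time engineering cost + per-unit cost x volume. The Python sketch below is only an illustration of that reasoning; the model, the function names, and all cost figures are hypothetical and do not come from the source.

```python
import math


def total_cost(nre, unit_cost, volume):
    """One-time (non-recurring) cost plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Smallest volume at which the ASIC's total cost is no higher than the FPGA's.

    Returns None when the FPGA is not more expensive per unit, because in
    this model the ASIC then never catches up through volume.
    """
    if fpga_unit <= asic_unit:
        return None
    volume = math.ceil((asic_nre - fpga_nre) / (fpga_unit - asic_unit))
    return max(0, volume)


# Hypothetical figures: the ASIC has high up-front cost but cheap units.
volume = break_even_volume(asic_nre=1_000_000, asic_unit=5,
                           fpga_nre=50_000, fpga_unit=55)
print(volume)
```

Below the break-even volume the FPGA's per-unit premium is cheaper than the ASIC's development cost, which is the situation the paragraph describes for low-volume products.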

                                                                                                    Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
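
For reference, Rent's rule relates the number of external terminals T of a region of logic to the number of gates (or logic blocks) g it contains: T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The Python sketch below only evaluates that formula. The parameter values are illustrative, and how a manufacturer turns such an estimate into a channel width is not described in the source.

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g**p."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    return terminals_per_block * num_blocks ** rent_exponent


# Illustrative comparison: the same amount of logic with two different
# Rent exponents. A higher exponent means more wiring leaves the region.
for exponent in (0.5, 0.75):
    print(exponent, rent_terminals(1024, 4, exponent))
```

This matches the observation above that two designs with the same gate count can need very different amounts of routing: the exponent p, not the gate count, captures how interconnect-heavy a design is.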

The FPGA considered here is an array (island-style) FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                                                    Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
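
The behavior of such a logic block can be sketched in a few lines of Python. This is a minimal software model under stated assumptions, not a description of any vendor's hardware: the class name, the bit ordering of the truth table, and the `registered` flag standing in for a configuration bit are all hypothetical.

```python
class LogicBlock:
    """Toy model of a 4-input LUT followed by a D flip-flop."""

    def __init__(self, truth_table, registered=False):
        # truth_table is a 16-bit integer; bit i holds the LUT output
        # for input pattern i (bit ordering is an assumption of this model).
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table
        self.registered = registered
        self.flip_flop = 0  # stored state of the D flip-flop

    def lut(self, a, b, c, d):
        # Pack the four input bits into an index 0..15 and look up one bit.
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        # On a clock edge the flip-flop captures the current LUT output.
        self.flip_flop = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        # The single output is either the registered or the unregistered value.
        return self.flip_flop if self.registered else self.lut(a, b, c, d)


def truth_table_from(func):
    """Build the 16-bit LUT contents from a function of four bits."""
    table = 0
    for index in range(16):
        a, b, c, d = index & 1, (index >> 1) & 1, (index >> 2) & 1, (index >> 3) & 1
        table |= (func(a, b, c, d) & 1) << index
    return table


# Configure the block as a 4-input XOR (parity) function.
xor4 = LogicBlock(truth_table_from(lambda a, b, c, d: a ^ b ^ c ^ d))
print(hex(xor4.truth_table), xor4.output(1, 0, 0, 0))
```

Any Boolean function of four inputs can be loaded the same way, which is why a LUT can stand in for arbitrary small pieces of combinational logic.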

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                                    Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                                    Switch box topology
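
The planar connection rule described above can be written down directly. The Python sketch below is a minimal model, assuming the four sides of a switch box are named north, east, south, and west and that tracks are numbered from 1 as in the text; the function name and side labels are hypothetical.

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_targets(side, track, channel_width):
    """Wires reachable from (side, track) in a planar switch box.

    In the planar (domain-based) topology a wire connects only to the
    wires with the same track number on the other three sides, through
    three programmable switches.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number out of range")
    return [(other, track) for other in SIDES if other != side]


# A wire entering from the west on track 2 of a 4-track channel.
print(planar_switch_targets("west", 2, 4))
```

Because a signal can never change track number inside a switch box, each track forms its own independent routing domain, which is where the name domain-based comes from.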

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) description or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

                                                                                                    Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is a method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Combining hardware verification with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good way to do this.

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with relying only on software simulations to verify that the hardware design is sound. Simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This makes it possible to catch many unforeseen software bugs that appear due to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig. Gate level: VGA controller

Fig. Gate level: sync unit

Fig. Technology schematic

14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.

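-- Fragment: the process header and signal declarations precede this point.
-- Quote characters were lost in extraction; single-bit literals are assumed to be std_logic.
if (rstn = '0') then
  -- Asynchronous, active-low reset
  rd     <= '0';
  red    <= '0';
  green  <= '0';
  blue   <= '0';
  addr_s <= x"0000" & "00";
elsif (clk = '1' and clk'event) then
  -- Restart the read address on the rising edge of the vertical enable
  if ((v_en = '1') and (v_en_reg = '0')) then
    addr_s <= (others => '0');
  end if;
  if (v_en = '1' and h_en = '1') then
    if (line_count <= 514) then
      if (c_hs <= 655) then
        -- Inside the image area: read one pixel and advance the address
        rd     <= '1';
        red    <= data_in;
        green  <= data_in;
        blue   <= data_in;
        addr_s <= addr_s + "000000000000000001";
      else
        -- Right of the image area
        rd    <= '0';
        red   <= '0';
        green <= '1';
        blue  <= '0';
      end if;
    else
      -- Below the image area
      rd    <= '0';
      red   <= '1';
      green <= '1';
      blue  <= '1';
    end if;
    -- Wrap the address after the last pixel location
    if (addr_s = "111011111111111111") then
      addr_s <= (others => '0');
    end if;
  else
    -- Blanking interval
    red   <= '0';
    green <= '0';
    blue  <= '0';
  end if;
end if;
-- end if;
-- end if;
end process;
end behv;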
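As a worked check of the wrap-around constant compared against addr_s in the fragment above, the literal can be read as an 18-bit value in which only bit 14 is cleared. The 18-bit width is inferred from the literal lengths in the fragment, not stated in the report.

```latex
\begin{align*}
111011111111111111_2 &= (2^{18} - 1) - 2^{14} \\
                     &= 262143 - 16384 \\
                     &= 245759
\end{align*}
```

The address counter therefore covers locations 0 to 245759, that is 245760 locations, which equals 640 x 384. Reading this as a stored image of 640 pixels by 384 lines is an assumption; the report does not state the image size at this point.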

12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be configured by the customer or designer after manufacturing, hence the name "field-programmable". FPGAs are programmed using a logic circuit diagram or source code in a hardware description language (HDL) to specify how the chip will work. They can be used to implement any logical function that an application-specific integrated circuit (ASIC) could perform, but the ability to update the functionality after shipping offers advantages for many applications.

FPGAs contain programmable logic components called "logic blocks", and a hierarchy of reconfigurable interconnects that allow the blocks to be "wired together", somewhat like a one-chip programmable breadboard. Logic blocks can be configured to perform complex combinational functions, or merely simple logic gates like AND and XOR. In most FPGAs, the logic blocks also include memory elements, which may be simple flip-flops or more complete blocks of memory.
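One common way a logic block realizes an arbitrary combinational function is a lookup table (LUT) whose contents are set at configuration time. The following is a minimal behavioral sketch of that idea, not the internal circuit of any particular device. The entity name lut3, the generic INIT, and the port names are hypothetical and chosen only for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical 3-input lookup table: the INIT generic holds the truth table,
-- so changing INIT changes the logic function without changing the circuit.
entity lut3 is
  generic (
    -- Bit i of INIT is the output for input value i = (c, b, a).
    -- "10010110" encodes a 3-input XOR; "10000000" would encode a 3-input AND.
    INIT : std_logic_vector(7 downto 0) := "10010110"
  );
  port (
    a, b, c : in  std_logic;
    y       : out std_logic
  );
end entity lut3;

architecture rtl of lut3 is
  signal sel : std_logic_vector(2 downto 0);
begin
  -- The three inputs form the address into the truth table
  sel <= c & b & a;
  -- Select the stored output bit for the current input combination
  y <= INIT(to_integer(unsigned(sel)));
end architecture rtl;
```

Instantiating the same entity with INIT => "10000000" turns it into a 3-input AND gate, which illustrates how one block can serve as either of the simple gates mentioned above.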

History

The FPGA industry sprouted from programmable read-only memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs both had the option of being programmed in batches in a factory or in the field (field-programmable); however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first commercially viable field-programmable gate array in 1985: the XC2064. The XC2064 had programmable gates and programmable interconnects between gates, the beginnings of a new technology and market. The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More than 20 years later, Freeman was entered into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for programmable logic arrays, gates, and logic blocks are founded in patents awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment proposed by Steve Casselman to develop a computer that would implement 600,000 reprogrammable gates. Casselman was successful, and the system was awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the mid-1990s, when competitors sprouted up, eroding significant market share. By 1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication and in the volume of production. In the early 1990s, FPGAs were primarily used in telecommunications and networking. By the end of the decade, FPGAs had found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic algorithm technology and FPGAs to create a sound recognition device. Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a step further by combining the logic blocks and interconnects of traditional FPGAs with embedded microprocessors and related peripherals to form a complete "system on a programmable chip". This work mirrors the architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a reconfigurable CPU architecture on a single chip called the SB24. That work was done in 1982. Examples of such hybrid technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another such device, which uses an AVR processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft" processor cores that are implemented within the FPGA logic. (See "Soft processors" below.)

As previously mentioned, many modern FPGAs have the ability to be reprogrammed at run time, and this is leading to the idea of reconfigurable computing or reconfigurable systems: CPUs that reconfigure themselves to suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an example of a reconfigurable soft processor implemented on FPGAs. However, it does not support dynamic reconfiguration at runtime, but instead adapts itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge. Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid approach by providing an array of processor cores and FPGA-like programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally achieved less functionality than their fixed ASIC counterparts. A combination of volume, fabrication improvements, research and development, and the I/O capabilities of new supercomputers has largely closed the performance gap between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the field to fix bugs, and lower non-recurring engineering costs. Vendors can also take a middle road by developing their hardware on ordinary FPGAs but manufacturing their final version so that it can no longer be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of higher-volume applications than they have historically been used for, and the company credits this for the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical to
implement within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoCs). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
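The trade-off between number format and parallelism can be illustrated with a
rough Python sketch. It assumes, purely for illustration, that the logic area
of a fixed-point multiplier grows with the square of its bit width; real
devices and dedicated multiplier blocks will differ. The function names and
the logic budget are hypothetical.

```python
def parallel_units(logic_budget: int, bit_width: int) -> int:
    """Multipliers that fit in the budget, ASSUMING area ~ bit_width ** 2."""
    return logic_budget // (bit_width * bit_width)


def throughput_ops_per_s(units: int, clock_hz: float) -> float:
    """Peak operations per second if every unit completes one op per cycle."""
    return units * clock_hz


def quantize(value: float, frac_bits: int) -> float:
    """Round a value to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


if __name__ == "__main__":
    budget = 65536  # hypothetical logic budget in arbitrary area units
    for width in (32, 16, 8):
        units = parallel_units(budget, width)
        print(width, units, throughput_ops_per_s(units, 100e6))
    # Fewer fractional bits means coarser values, the price of extra units.
    print(quantize(0.3, 4), quantize(0.3, 12))
```

Under this assumption, halving the word width quadruples the number of
arithmetic units that fit, which is why a modest clock rate can still yield
high aggregate throughput.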

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.
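The low-volume argument can be made concrete with a simple break-even
calculation. The Python sketch below models total cost as a one-time
engineering cost plus a per-unit cost. All figures are invented for
illustration and are not taken from any vendor.

```python
def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """One-time engineering cost plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: float, asic_unit: float,
                      fpga_nre: float, fpga_unit: float) -> float:
    """Volume at which the ASIC and FPGA totals are equal."""
    if fpga_unit <= asic_unit:
        raise ValueError("model assumes the FPGA costs more per unit")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


if __name__ == "__main__":
    # Hypothetical figures: the ASIC has high NRE but a cheap unit price.
    volume = break_even_volume(asic_nre=1_000_000, asic_unit=5,
                               fpga_nre=50_000, fpga_unit=55)
    print(volume)  # below this volume the FPGA is the cheaper option
```

Below the break-even volume the FPGA's per-unit premium costs less than the
ASIC's up-front engineering; above it the ASIC wins on total cost.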

                                                                                                      Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
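Rent's rule relates the number of external terminals T of a block of logic to
the number of gates G it contains as T = t * G^p, where t is the average
number of terminals per gate and p is the Rent exponent. The Python sketch
below evaluates the rule. The parameter values are illustrative assumptions,
not measurements from any real design.

```python
def rent_terminals(gates: int, t: float, p: float) -> float:
    """Estimated external terminals for a block of `gates` gates (T = t * G**p)."""
    if gates < 1:
        raise ValueError("gates must be positive")
    return t * gates ** p


if __name__ == "__main__":
    # Same gate count, different Rent exponents (hypothetical values):
    # a regular, locally connected design versus a routing-heavy one.
    for exponent in (0.5, 0.75):
        print(exponent, rent_terminals(1024, t=4.0, p=exponent))
```

A higher exponent means more wiring for the same amount of logic, which
matches the observation that routing demand varies between designs of equal
gate count.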

The example FPGA considered here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All the
routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                      Typical logic block

There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are managed separately.
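The behaviour of such a logic block can be modelled in a few lines of Python.
The sketch below is a software illustration only, not vendor code: a 16-bit
truth table stands in for the LUT contents, and a flag selects the registered
or unregistered output. The class and method names are hypothetical.

```python
class LogicBlock:
    """A 4-input LUT followed by a D flip-flop, with a selectable output."""

    def __init__(self, truth_table: int, registered: bool = False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 bits")
        self.truth_table = truth_table  # bit i is the output for input index i
        self.registered = registered    # True selects the flip-flop output
        self.ff = 0                     # flip-flop state, cleared at start

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational lookup: the four inputs form a 4-bit index."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output, registered or unregistered."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000)   # only index 15 (all inputs high) yields 1
    xor4 = LogicBlock(0x6996)   # parity of the four inputs
    print(and4.output(1, 1, 1, 1), xor4.output(1, 0, 0, 0))
```

Any Boolean function of four inputs is obtained by changing only the
truth-table constant, which is what makes the LUT a general-purpose logic
element.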

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                      Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use longer
routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                                                      Switch box topology
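The planar topology described above can be enumerated programmatically. The
following Python sketch lists the programmable switches of one switch box for
a given channel width. The side names and the data representation are
assumptions made for illustration.

```python
from itertools import combinations

SIDES = ("north", "east", "south", "west")


def planar_switch_box(channel_width: int) -> set:
    """Return the programmable switches of a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints. A wire on
    track k is only ever joined to track k on the other three sides.
    """
    switches = set()
    for track in range(channel_width):
        for side_a, side_b in combinations(SIDES, 2):
            switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(switches: set, endpoint: tuple) -> list:
    """Endpoints reachable from `endpoint` through a single switch."""
    return sorted(other
                  for switch in switches if endpoint in switch
                  for other in switch if other != endpoint)


if __name__ == "__main__":
    box = planar_switch_box(channel_width=4)
    print(len(box))                        # six switches per track
    print(neighbours(box, ("north", 1)))   # same track on the other sides
```

Because tracks never mix inside the box, a signal that starts on one track
stays on that track number for its whole route, which is the defining
property of the domain-based topology.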

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it's possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device such
as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
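The idea of repeating the same test bench at different stages can be sketched
in Python, which here only stands in for an HDL simulator. A behavioural
(RTL-like) full adder serves as the reference, and a gate-level version plays
the role of the synthesized netlist. All function names are hypothetical.

```python
from itertools import product


def adder_rtl(a: int, b: int, cin: int) -> tuple:
    """Behavioural description: add the three bits arithmetically."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def adder_gates(a: int, b: int, cin: int) -> tuple:
    """Gate-level description built only from XOR, AND and OR."""
    a_xor_b = a ^ b
    sum_bit = a_xor_b ^ cin
    carry_out = (a & b) | (a_xor_b & cin)
    return sum_bit, carry_out


def run_testbench(dut, reference) -> list:
    """Apply every input vector and return those where the outputs differ."""
    return [vector for vector in product((0, 1), repeat=3)
            if dut(*vector) != reference(*vector)]


if __name__ == "__main__":
    mismatches = run_testbench(adder_gates, adder_rtl)
    print("mismatches:", mismatches)  # an empty list means the models agree
```

Timing is not modelled in this sketch. A post-layout simulation would reuse
the same vectors with back-annotated delays added to the gate-level model.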

                                                                                                      Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages.
Some, but not all, flash devices can be in-system programmed. Usually a flash
cell is smaller than an equivalent EEPROM cell and is therefore less expensive
to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                      Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is a method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying hardware designs alongside early software and firmware co-design
has become mainstream. Prototyping SoC and ASIC designs on an FPGA has
become a good way to do this.

                                                                                                      Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it
is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back
from the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs that appear due
to today's array of operating systems, applications, and hardware to be found
and fixed early.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement to the chip features than
would otherwise have been possible.

                                                                                                      13 SIMULATION RESULTS

                                                                                                      GATE LEVEL

Fig. VGA CONTROLLER

                                                                                                      GATE LEVEL

Fig. SYNC UNIT

                                                                                                      TECHNOLOGY SCHEMATIC

                                                                                                      14 APPLICATIONS

                                                                                                      Motion Tracking Systems

                                                                                                      Image processing systems

                                                                                                      Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
(pixelated) picture is taken from memory space and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.

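      end if;
      -- Wrap the address counter back to zero after the last location
      if (addr_s = "111011111111111111") then
        addr_s <= (others => '0');
      end if;
    else
      -- Drive the colour outputs low otherwise
      red   <= '0';
      green <= '0';
      blue  <= '0';
    end if;
  end if;
  -- end if;
  -- end if;
end process;
end behv;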

                                                                                                        12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can
be configured by the customer or designer after manufacturing, hence the
name "field-programmable". FPGAs are programmed using a logic circuit
diagram or source code in a hardware description language (HDL) to specify
how the chip will work. They can be used to implement any logical function
that an application-specific integrated circuit (ASIC) could perform, but the
ability to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks", and a
hierarchy of reconfigurable interconnects that allow the blocks to be "wired
together", somewhat like a one-chip programmable breadboard. Logic blocks
can be configured to perform complex combinational functions, or merely
simple logic gates like AND and XOR. In most FPGAs, the logic blocks also
include memory elements, which may be simple flip-flops or more complete
blocks of memory.
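
As an illustration of the logic blocks described above, the following is a
minimal VHDL sketch of one block modeled as a 4-input lookup table followed by
an optional flip-flop. It is an assumption-laden teaching model, not a vendor
primitive: the entity name logic_block, the generics INIT and REGISTERED, and
the port names are all hypothetical.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical model of a single logic block: a 4-input LUT plus an
-- optional flip-flop. Names and generics are illustrative only.
entity logic_block is
  generic (
    INIT       : std_logic_vector(15 downto 0) := x"6996"; -- truth table: 4-input XOR
    REGISTERED : boolean := false                          -- true selects the flip-flop output
  );
  port (
    clk : in  std_logic;
    a   : in  std_logic_vector(3 downto 0);  -- LUT inputs, used as a table index
    y   : out std_logic
  );
end entity logic_block;

architecture rtl of logic_block is
  signal lut_out : std_logic;
  signal ff_out  : std_logic := '0';
begin
  -- Combinational part: the four inputs select one bit of the truth table.
  lut_out <= INIT(to_integer(unsigned(a)));

  -- Memory element: capture the LUT output on each rising clock edge.
  process (clk)
  begin
    if rising_edge(clk) then
      ff_out <= lut_out;
    end if;
  end process;

  -- Output multiplexer: registered or purely combinational.
  y <= ff_out when REGISTERED else lut_out;
end architecture rtl;
```

Changing only the INIT generic turns the same block into a different function;
for example x"8000" gives a 4-input AND. This mirrors how configuring an FPGA
changes behavior without changing the silicon.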

History

The FPGA industry sprouted from programmable read only
memory (PROM) and programmable logic devices (PLDs). PROMs and PLDs
both had the option of being programmed in batches in a factory or in the field
(field programmable); however, programmable logic was hard-wired between
logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the
first commercially viable field programmable gate array in 1985: the
XC2064. The XC2064 had programmable gates and programmable
interconnects between gates, the beginnings of a new technology and market.
The XC2064 boasted a mere 64 configurable logic blocks (CLBs), with two
3-input lookup tables (LUTs). More than 20 years later, Freeman was entered
into the National Inventors Hall of Fame for his invention.

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perlof and Hana Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic. (See "Soft
processors" below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at "run time", and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts itself
to a specific program.

Additionally, new, non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx
1992: 600,000, Naval Surface Warfare Department
Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers have largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after the
design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, which the company credits for the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the advantage
of more predictable timing delays and a higher logic-to-interconnect ratio.
FPGA architectures, on the other hand, are dominated by interconnect. This
makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.
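
To make the CPLD structure described above concrete, here is a minimal VHDL
sketch of a sum-of-products term feeding one clocked register. It is only an
illustration under stated assumptions: the entity name sop_macrocell, the port
names, and the two product terms chosen are hypothetical and not taken from any
particular device.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical CPLD-style macrocell: a sum of two product terms
-- feeding a single clocked register. All names are illustrative.
entity sop_macrocell is
  port (
    clk     : in  std_logic;
    a, b, c : in  std_logic;
    q       : out std_logic
  );
end entity sop_macrocell;

architecture rtl of sop_macrocell is
  signal sum_of_products : std_logic;
begin
  -- AND plane forms the product terms; OR plane sums them.
  sum_of_products <= (a and b) or ((not a) and c);

  -- The result is captured by the macrocell register on the rising edge.
  process (clk)
  begin
    if rising_edge(clk) then
      q <= sum_of_products;
    end if;
  end process;
end architecture rtl;
```

Because the path from inputs to register is a fixed two-level AND/OR structure,
its delay is easy to predict, which is the timing advantage noted for CPLDs.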

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that can implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs began
to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low clock rates (in the MHz
range). The flexibility of the FPGA allows for even higher performance by
trading off precision and range in the number format for an increased number
of parallel arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
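The precision-for-parallelism trade-off can be illustrated with a small
Python sketch. The cost model here is an assumption made only for
illustration (one arithmetic unit is taken to cost one LUT per bit of word
width, and the 4096-LUT budget is hypothetical); real devices and tools will
differ.

```python
def quantize(value: float, frac_bits: int) -> float:
    """Round value to the nearest multiple of 2**-frac_bits (a fixed-point grid)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale


def parallel_units(lut_budget: int, word_bits: int, luts_per_bit: int = 1) -> int:
    """Hypothetical cost model: one unit costs word_bits * luts_per_bit LUTs."""
    return lut_budget // (word_bits * luts_per_bit)


x = 0.3
for word_bits in (8, 16, 32):
    frac_bits = word_bits - 1  # assumption: one sign bit, the rest fractional
    error = abs(quantize(x, frac_bits) - x)
    units = parallel_units(lut_budget=4096, word_bits=word_bits)
    print(f"{word_bits:2d}-bit words: {units:4d} units, rounding error {error:.2e}")
```

Narrower words leave room for more units in the same budget, at the price of
a larger rounding error per value.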

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software, and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC for a low-volume application. Today, new cost and performance dynamics
have broadened the range of viable applications.
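The low-volume argument amounts to a break-even calculation, sketched below
in Python. All figures are hypothetical and chosen only to make the
arithmetic easy to follow; they are not taken from this report or from any
vendor.

```python
def total_cost(nre: float, unit_cost: float, volume: int) -> float:
    """One-time engineering cost (NRE) plus per-unit cost times volume."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre: float, asic_unit: float,
                      fpga_unit: float, fpga_nre: float = 0.0) -> float:
    """Volume above which the ASIC becomes cheaper than the FPGA."""
    if fpga_unit <= asic_unit:
        raise ValueError("FPGA unit cost must exceed ASIC unit cost")
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical numbers: 1,000,000 ASIC NRE, 5 per ASIC, 55 per FPGA.
volume = break_even_volume(asic_nre=1_000_000, asic_unit=5, fpga_unit=55)
print(f"ASIC pays off above about {volume:.0f} units")
```

Below the break-even volume the higher per-unit price of the FPGA is the
cheaper option overall.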

                                                                                                        Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule, or by experiments with existing designs.
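Rent's rule is commonly written as T = t * g^p, where T is the number of
external connections of a block of logic, g is the number of gates (or logic
blocks) it contains, t is the average number of terminals per gate, and p is
the Rent exponent. A minimal Python sketch follows; the parameter values in
the example are illustrative assumptions, not measurements.

```python
def rent_terminals(gates: int, t: float, p: float) -> float:
    """Estimate external terminals with Rent's rule: T = t * gates**p."""
    if gates < 1:
        raise ValueError("gates must be at least 1")
    return t * gates ** p


# Illustrative values only: t = 4 terminals per block, two different exponents.
for exponent in (0.5, 0.7):
    estimate = rent_terminals(gates=1000, t=4.0, p=exponent)
    print(f"p = {exponent}: about {estimate:.0f} terminals for 1000 blocks")
```

A larger exponent means more signals cross the boundary of a region of
logic, which is why routing demand can differ widely between designs with
the same gate count.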

The FPGA considered here is an array-style (island-style) FPGA. It consists
of an array of logic blocks and routing channels. Two I/O pads fit into the
height of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                        Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
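The behavior of such a logic block can be sketched in Python. This is a
behavioral model written for illustration, not a description of any
particular vendor's cell; the class and method names are hypothetical.

```python
class LogicBlock:
    """4-input LUT followed by a D flip-flop, with a selectable output."""

    def __init__(self, truth_table: int, registered: bool = False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table  # bit i is the output for input pattern i
        self.registered = registered    # True selects the flip-flop output
        self.q = 0                      # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output: the inputs form an index into the table."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.q if self.registered else self.lut(a, b, c, d)


and4 = LogicBlock(0x8000)                      # only pattern 1111 yields 1
parity = LogicBlock(0x6996, registered=True)   # XOR of the four inputs
```

Configuring the block means choosing the 16 table bits and the output select
bit; the same hardware then implements any function of four inputs.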

For this example architecture, the locations of the FPGA logic block pins
are shown below.

                                                                                                        Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                                                        Switch box topology
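The planar (domain-based) rule can be expressed in a few lines of Python.
This is a sketch of the connectivity rule only; the side names and the
1-based track numbering are conventions chosen for the example.

```python
from typing import List, Tuple

SIDES = ("north", "east", "south", "west")


def planar_connections(side: str, track: int,
                       channel_width: int) -> List[Tuple[str, int]]:
    """Wires reachable from a wire entering the switch box on `side`.

    In the planar topology a wire on track k can only be switched onto
    track k of the three other sides.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number out of range")
    return [(other, track) for other in SIDES if other != side]


print(planar_connections("west", 2, channel_width=4))
```

Because the track number never changes, a signal that starts on one track
stays in that track's "domain" along its whole route, which keeps the switch
box small at the cost of some routing flexibility.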

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures, because it is possible to specify them
numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to an actual configuration works as
follows. The source files are fed to a software suite from the FPGA/CPLD
vendor that, through different steps, produces a file. This file is then
transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external
memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce
the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
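The idea of repeating the same test bench at the RTL and gate level can be
sketched in Python with a one-bit full adder. This is only an analogy to an
HDL test bench: the two functions stand in for a behavioral description and
a synthesized netlist, and all names are hypothetical.

```python
from itertools import product


def adder_rtl(a: int, b: int, cin: int):
    """Behavioral ("RTL") description: plain arithmetic."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def adder_gates(a: int, b: int, cin: int):
    """Gate-level description built only from XOR, AND and OR."""
    half_sum = a ^ b
    return half_sum ^ cin, (a & b) | (half_sum & cin)


def run_testbench(dut, reference):
    """Apply every input vector to both models and collect the mismatches."""
    failures = []
    for vector in product((0, 1), repeat=3):
        if dut(*vector) != reference(*vector):
            failures.append(vector)
    return failures


mismatches = run_testbench(adder_gates, adder_rtl)
print("gate-level model matches RTL" if not mismatches else mismatches)
```

An empty mismatch list plays the role of the gate-level simulation
confirming that synthesis preserved the RTL behavior; timing simulation with
back-annotated delays is not modeled here.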

                                                                                                        Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time
programmable in production because of plastic packaging. Windowed devices can
be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can
be erased, even in plastic packages. Some, but not all, EEPROM devices can be
in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; its Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design
has become mainstream practice, and prototyping SoC and ASIC designs on an
FPGA has become a good method to do this.


                                                                                                        Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with designers relying only on software
simulations to verify that their hardware design is sound. Limitations in
simulation speed and modeling accuracy hinder development that relies on
simulation alone.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and for testing at the
software-hardware integration stage. This allows many unforeseen software
bugs, which appear due to today's array of operating systems, applications,
and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies
in the system work well together outside the simulation stage, in actual
hardware.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


                                                                                                        13 SIMULATION RESULTS

Fig: VGA Controller (gate level)

Fig: Sync Unit (gate level)

Fig: Technology schematic

                                                                                                        14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for each
pixel, as well as horizontal and vertical synchronization signals, in order to
correctly display the picture on a connected monitor.




                                                                                                          12 FPGA ARCHITECTURE

A field-programmable gate array (FPGA) is a semiconductor device that can be
configured by the customer or designer after manufacturing, hence the name
"field-programmable". FPGAs are programmed using a logic circuit diagram or
source code in a hardware description language (HDL) to specify how the chip
will work. They can be used to implement any logical function that an
application-specific integrated circuit (ASIC) could perform, but the ability
to update the functionality after shipping offers advantages for many
applications.

FPGAs contain programmable logic components called "logic blocks" and a
hierarchy of reconfigurable interconnects that allow the blocks to be wired
together, somewhat like a one-chip programmable breadboard. Logic blocks can be
configured to perform complex combinational functions or merely simple logic
gates like AND and XOR. In most FPGAs, the logic blocks also include memory
elements, which may be simple flip-flops or more complete blocks of memory.
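As a rough illustration of how one configurable block can stand in for
different gates, the sketch below models a lookup table in Python. The
make_lut helper and the bit ordering it uses are assumptions made for this
example only and do not describe any particular vendor's hardware.

```python
from itertools import product


def make_lut(truth_table):
    """Return a function computing the logic function stored in truth_table.

    truth_table[i] is the output bit for the input combination whose
    binary encoding is i (first input = most significant bit).
    This ordering is an assumption of the sketch.
    """
    n_inputs = (len(truth_table) - 1).bit_length()
    if len(truth_table) != 2 ** n_inputs:
        raise ValueError("truth table length must be a power of two")

    def lut(*inputs):
        if len(inputs) != n_inputs:
            raise ValueError(f"expected {n_inputs} inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | (bit & 1)
        return truth_table[index]

    return lut


# The same structure becomes a different gate purely by changing its contents.
and_gate = make_lut([0, 0, 0, 1])  # output is 1 only for inputs (1, 1)
xor_gate = make_lut([0, 1, 1, 0])  # output is 1 when the inputs differ

for a, b in product((0, 1), repeat=2):
    print(a, b, and_gate(a, b), xor_gate(a, b))
```

Reconfiguring the block amounts to loading a different list of bits; the
surrounding code does not change.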

History

The FPGA industry sprouted from programmable read-only memory (PROM) and
programmable logic devices (PLDs). PROMs and PLDs both had the option of being
programmed in batches in a factory or in the field (field-programmable);
however, programmable logic was hard-wired between logic gates.

Xilinx co-founders Ross Freeman and Bernard Vonderschmitt invented the first
commercially viable field-programmable gate array, the XC2064, in 1985. The
XC2064 had programmable gates and programmable interconnects between gates, the
beginnings of a new technology and market. The XC2064 boasted a mere 64
configurable logic blocks (CLBs), with two 3-input lookup tables (LUTs). More
than 20 years later, Freeman was entered into the National Inventors Hall of
Fame for his invention.


Some of the industry's foundational concepts and technologies for programmable
logic arrays, gates, and logic blocks are founded in patents awarded to David
W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement 600,000
reprogrammable gates. Casselman was successful, and the system was awarded a
patent in 1992.

Xilinx continued unchallenged and grew quickly from 1985 to the mid-1990s, when
competitors sprouted up, eroding significant market share. By 1993, Actel was
serving about 18 percent of the market.

The 1990s were an explosive period for FPGAs, both in sophistication and in
volume of production. In the early 1990s, FPGAs were primarily used in
telecommunications and networking. By the end of the decade, FPGAs had found
their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device. Thompson's
algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip to decide the
configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural approach a
step further by combining the logic blocks and interconnects of traditional
FPGAs with embedded microprocessors and related peripherals to form a complete
"system on a programmable chip". This work mirrors the architecture by Ron
Perlof and Hana Potash of Burroughs Advanced Systems Group, which combined a
reconfigurable CPU architecture on a single chip called the SB24. That work was
done in 1982. Examples of such hybrid technologies can be found in the Xilinx
Virtex-II PRO and Virtex-4 devices, which include one or more PowerPC
processors embedded within the FPGA's logic fabric. The Atmel FPSLIC is another
such device, which uses an AVR processor in combination with Atmel's
programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of "soft"
processor cores that are implemented within the FPGA logic (see Soft processors
below).

As previously mentioned, many modern FPGAs have the ability to be reprogrammed
at run time, and this is leading to the idea of reconfigurable computing or
reconfigurable systems: CPUs that reconfigure themselves to suit the task at
hand. The Mitrion Virtual Processor from Mitrionics is an example of a
reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at run time, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a hybrid
approach by providing an array of processor cores and FPGA-like programmable
cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)
1992: 600,000 gates (Naval Surface Warfare Department)
Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx
1987: $14 million
~1993: >$385 million
2005: $1.9 billion
2010 (estimate): $2.75 billion


12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively
ASIC complexity has bolstered development time and costs
R&D resources and headcount are decreasing
Revenue losses for slow time-to-market are increasing
Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing number of
higher-volume applications than they have historically been used for, to which
the company attributes the growing number of FPGA design starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has a
somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical for
implementation within them) but also far more complex to design for.
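To make the sum-of-products structure concrete, here is a small Python sketch
of how a CPLD-style array could evaluate one output: each product term ANDs a
chosen set of inputs (true or inverted), and the terms are then ORed together.
The data layout and the names product_term and sum_of_products are
illustrative assumptions, not part of any device specification.

```python
def product_term(inputs, literals):
    """AND together the selected literals.

    literals maps an input name to the value it must have
    (1 = use the input as-is, 0 = use its inverted form).
    """
    return int(all(inputs[name] == wanted for name, wanted in literals.items()))


def sum_of_products(inputs, terms):
    """OR together the results of all product terms."""
    return int(any(product_term(inputs, term) for term in terms))


# XOR written as a sum of two product terms: (a AND NOT b) OR (NOT a AND b).
xor_terms = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, sum_of_products({"a": a, "b": b}, xor_terms))
```

Every output passes through the same two-level AND/OR path regardless of the
function programmed, which is one way to see why CPLD timing is described as
more predictable.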

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar space,
that of glue logic for PCBs. As their size, capabilities, and speed increased,
they began to take over larger and larger functions, to the point where some
are now marketed as full systems on chips (SoC). Particularly with the
introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs began
to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is code
breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
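A back-of-the-envelope calculation shows why many slow units can outrun one
fast unit. All figures below (clock rates, unit counts, word widths, and the
assumption that the area of an arithmetic unit scales linearly with word
width) are hypothetical values chosen only for illustration.

```python
def throughput_ops_per_s(parallel_units, clock_hz, ops_per_unit_per_cycle=1):
    """Peak operations per second for identical units running in lockstep."""
    return parallel_units * clock_hz * ops_per_unit_per_cycle


# Hypothetical processor: one arithmetic unit at 3 GHz.
cpu = throughput_ops_per_s(parallel_units=1, clock_hz=3_000_000_000)

# Hypothetical FPGA design: 200 arithmetic units at only 100 MHz.
fpga = throughput_ops_per_s(parallel_units=200, clock_hz=100_000_000)

print(f"CPU : {cpu:.2e} ops/s")
print(f"FPGA: {fpga:.2e} ops/s ({fpga / cpu:.1f}x)")


def units_after_narrowing(units, old_bits, new_bits):
    """Units that fit in the same logic area after reducing word width.

    Assumes area per unit grows linearly with word width, which is a
    simplification (multipliers, for example, grow faster than that).
    """
    return units * old_bits // new_bits


# Trading precision for parallelism: 32-bit words narrowed to 16 bits.
narrow_units = units_after_narrowing(200, old_bits=32, new_bits=16)
print(narrow_units, "units after narrowing")
```

These are peak figures; real designs are also limited by memory bandwidth and
by how much of the algorithm can actually run in parallel.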

The adoption of FPGAs in high-performance computing is currently limited by the
complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications, the
premium that companies pay in hardware costs per unit for a programmable chip
is more affordable than the development resources spent on creating an ASIC.
Today, new cost and performance dynamics have broadened the range of viable
applications.

                                                                                                          Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit into
the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even among
designs with the same amount of logic. (For example, a crossbar switch requires
much more routing than a systolic array with the same gate count.) Since unused
routing tracks increase the cost (and decrease the performance) of the part
without providing any benefit, FPGA manufacturers try to provide just enough
tracks so that most designs that will fit in terms of LUTs and I/Os can be
routed. This is determined by estimates such as those derived from Rent's rule
or by experiments with existing designs.
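Rent's rule relates the number of external connections T of a region of logic
to the number of blocks g it contains: T = t * g^p, where t is the average
number of terminals per block and p is the Rent exponent. The sketch below
evaluates it for two hypothetical designs; the values of t and p are
assumptions picked for illustration, not measurements.

```python
def rent_terminals(blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g**p."""
    if not 0.0 <= rent_exponent <= 1.0:
        raise ValueError("Rent exponent is expected to lie between 0 and 1")
    return terminals_per_block * blocks ** rent_exponent


# Two hypothetical designs with the same amount of logic (1,024 blocks)
# but different interconnect complexity, expressed through the exponent.
regular_design = rent_terminals(1024, terminals_per_block=4, rent_exponent=0.5)
irregular_design = rent_terminals(1024, terminals_per_block=4, rent_exponent=0.75)

print(f"p = 0.50: about {regular_design:.0f} terminals")
print(f"p = 0.75: about {irregular_design:.0f} terminals")
```

A higher exponent models a design whose wiring demand grows faster with size,
so two designs with equal logic can need very different numbers of routing
tracks.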

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or the
width of one column, as shown below. All the routing channels have the same
width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                          Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
and other signals are separately managed.
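The logic block described above can be modelled in a few lines of software. The following is a minimal behavioural sketch, not a description of any vendor's actual cell. The class name, the 16-bit truth-table encoding, and the method names are assumptions made for illustration.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        # Assumed encoding: bit i of truth_table is the LUT output when the
        # inputs (d, c, b, a) read as the binary number i, with a as bit 0.
        if not 0 <= truth_table < 1 << 16:
            raise ValueError("a 4-input LUT holds exactly 16 configuration bits")
        self.truth_table = truth_table
        self.registered = registered
        self.ff = 0  # flip-flop state, cleared at start-up in this model

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output for the given input bits."""
        index = (a & 1) | ((b & 1) << 1) | ((c & 1) << 2) | ((d & 1) << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000)            # only input pattern 1111 gives 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 1, 1, 0))
```

Choosing a different 16-bit constant changes the logic function without changing the hardware, which is the essence of a LUT-based block.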

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                          Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number one connects only to wires in track
number one in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                                          Switch box topology
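The planar (domain-based) rule can also be written down as a small enumeration. The sketch below is a hypothetical model only. The side names, the function names, and the zero-based track numbering are assumptions for illustration (the text numbers tracks from one).

```python
SIDES = ("north", "east", "south", "west")


def planar_switch_connections(side: str, track: int, channel_width: int):
    """Wires reachable from (side, track) in a planar switch box.

    A wire entering on one side can reach the wire with the SAME track
    number on each of the other three sides, and nothing else.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 0 <= track < channel_width:
        raise ValueError("track index outside the channel width")
    return [(other, track) for other in SIDES if other != side]


def all_switches(channel_width: int):
    """Set of distinct programmable switches in one switch box."""
    switches = set()
    for track in range(channel_width):
        for side in SIDES:
            for other in planar_switch_connections(side, track, channel_width):
                # A switch joins two wire ends; order does not matter.
                switches.add(frozenset({(side, track), other}))
    return switches


if __name__ == "__main__":
    print(planar_switch_connections("west", 2, channel_width=4))
    print(len(all_switches(4)))
```

Because tracks never mix, each track forms its own routing domain: in this model a box with channel width W contains six switches per track, one for each pair of sides.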

Modern FPGA families expand upon the above capabilities to include higher-
level functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description
language (HDL) or a schematic design. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license),
and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.

                                                                                                          Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design
has become mainstream, and prototyping SoC and ASIC designs on an FPGA
has become a good method to do this.

                                                                                                          Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it
is functionally correct. This is compared to designers relying only on software
simulations to verify that their hardware design is sound. Simulation speed and
modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back
from the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear due
to today's array of operating systems, applications, and hardware, to be found
early.

3. Prototyping also allows the developer to ensure that all IP technologies on
the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement to the chip features than
would otherwise have been possible.

                                                                                                          13 SIMULATION RESULTS

                                                                                                          GATE LEVEL

Fig VGA CONTROLLER

                                                                                                          GATE LEVEL

                                                                                                          Fig SYNC UNIT

                                                                                                          TECHNOLOGY SCHEMATIC

                                                                                                          14 APPLICATIONS

                                                                                                          Motion Tracking Systems

                                                                                                          Image processing systems

                                                                                                          Displaying Systems

14.1 CONCLUSION

Hence, with the VGA Controller implemented on an FPGA, a processor-generated
(pixelated) picture is taken from memory space, and the Controller provides
digital RGB values for each pixel, as well as horizontal and vertical
synchronization signals, in order to correctly display the picture on a
connected monitor.
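The relationship between the pixel counters and the synchronization signals can be sketched in software. The model below is not the controller described in this report. It assumes the widely used 640x480 at 60 Hz timing (25.175 MHz pixel clock, 800 clocks per line, 525 lines per frame, active-low sync pulses), none of which is stated in this section, and all names are hypothetical.

```python
# Assumed industry-standard 640x480 @ 60 Hz timing, in pixel clocks / lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK   # clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK   # lines per frame
PIXEL_CLOCK_HZ = 25_175_000


def vga_signals(h_count: int, v_count: int):
    """Return (hsync, vsync, video_on) for the given counter values.

    Sync outputs are active low: 0 during the sync pulse, 1 otherwise.
    video_on is True only inside the visible area, where RGB data from
    memory would be driven out; elsewhere the RGB lines are blanked.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= h_count < h_start + H_SYNC else 1
    vsync = 0 if v_start <= v_count < v_start + V_SYNC else 1
    video_on = h_count < H_VISIBLE and v_count < V_VISIBLE
    return hsync, vsync, video_on


def next_counters(h_count: int, v_count: int):
    """Advance the counters by one pixel clock, wrapping per line and frame."""
    h_count += 1
    if h_count == H_TOTAL:
        h_count = 0
        v_count = (v_count + 1) % V_TOTAL
    return h_count, v_count


def refresh_rate_hz() -> float:
    """Frames per second implied by the pixel clock and the frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    print(H_TOTAL, V_TOTAL, round(refresh_rate_hz(), 2))
```

In hardware, the same logic reduces to two counters and a handful of comparators, which is the reason a sync unit fits comfortably in a small part of an FPGA.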

                                                                                                          REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. A VGA display controller, by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"

                                                                                                          • Pixel
                                                                                                            • 6 SIGNALS IN VGA CONTROLLER
                                                                                                            • Gates
                                                                                                            • Market size
                                                                                                            • 122 FPGA Comparisons

Some of the industry's foundational concepts and technologies for
programmable logic arrays, gates, and logic blocks are founded in patents
awarded to David W. Page and LuVerne R. Peterson in 1985.

In the late 1980s, the Naval Surface Warfare Department funded an experiment
proposed by Steve Casselman to develop a computer that would implement
600,000 reprogrammable gates. Casselman was successful, and the system was
awarded a patent in 1992.

Xilinx continued unchallenged and quickly growing from 1985 to the
mid-1990s, when competitors sprouted up, eroding significant market share. By
1993, Actel was serving about 18 percent of the market.

The 1990s were an explosive period of time for FPGAs, both in sophistication
and the volume of production. In the early 1990s, FPGAs were primarily used
in telecommunications and networking. By the end of the decade, FPGAs
found their way into consumer, automotive, and industrial applications.

FPGAs got a glimpse of fame in 1997, when Adrian Thompson merged genetic
algorithm technology and FPGAs to create a sound recognition device.
Thompson's algorithm allowed an array of 64 x 64 cells in a Xilinx FPGA chip
to decide the configuration needed to accomplish a sound recognition task.

12.1 Modern developments

A recent trend has been to take the coarse-grained architectural
approach a step further by combining the logic blocks and interconnects of
traditional FPGAs with embedded microprocessors and related peripherals to
form a complete "system on a programmable chip". This work mirrors the
architecture by Ron Perloff and Hanan Potash of Burroughs Advanced Systems
Group, which combined a reconfigurable CPU architecture on a single chip
called the SB24. That work was done in 1982. Examples of such hybrid
technologies can be found in the Xilinx Virtex-II PRO and Virtex-4 devices,
which include one or more PowerPC processors embedded within the FPGA's
logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic. (See Soft
processors below.)

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to suit
the task at hand. The Mitrion Virtual Processor from Mitrionics is an example
of a reconfigurable soft processor implemented on FPGAs. However, it does not
support dynamic reconfiguration at run time, but instead adapts itself to a
specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates (Xilinx)

1992: 600,000 gates (Naval Surface Warfare Department)

Early 2000s: millions of gates

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion


                                                                                                            122 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to
re-program in the field to fix bugs, and lower non-recurring engineering
costs. Vendors can also take a middle road by developing their hardware on
ordinary FPGAs but manufacturing their final version so that it can no longer
be modified after the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively.

ASIC complexity has increased development time and costs.

R&D resources and headcount are decreasing.

Revenue losses for slow time-to-market are increasing.

Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, a
shift to which the company attributes the growing number of FPGA design
starts (see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.
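
The sum-of-products structure described above can be illustrated with a small
software model. This is only a sketch under stated assumptions: the names
sum_of_products and Macrocell are hypothetical, and the model ignores timing
and any vendor-specific details. Each product term is an AND of input
literals, the terms are ORed together, and the result feeds a clocked
register.

```python
from typing import Dict, List

# A product term maps an input name to the value (0 or 1) it must have.
ProductTerm = Dict[str, int]


def sum_of_products(terms: List[ProductTerm], inputs: Dict[str, int]) -> int:
    """OR together the AND terms; a term is true when all its literals match."""
    return int(any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in terms
    ))


class Macrocell:
    """Hypothetical macrocell: one sum-of-products array feeding one register."""

    def __init__(self, terms: List[ProductTerm]):
        self.terms = terms
        self.q = 0  # register contents

    def clock(self, inputs: Dict[str, int]) -> int:
        """On a clock edge, latch the current sum-of-products result."""
        self.q = sum_of_products(self.terms, inputs)
        return self.q


# XOR written as two product terms: (a AND NOT b) OR (NOT a AND b)
xor_terms = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]
cell = Macrocell(xor_terms)
```

Because every output is computed by the same fixed AND-OR path, the delay from
input to register is easy to predict, which is the timing advantage noted
above.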

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as the ability to have logic blocks implement
decoders or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

                                                                                                            123 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications that had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
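
The precision trade-off mentioned above can be sketched in software. The
example below is an illustration only, with hypothetical helper names
to_fixed and fixed_dot: it computes a dot product in fixed-point arithmetic
with a chosen number of fractional bits, so the loss of accuracy from a
narrower number format can be observed directly. It does not model the FPGA
area that a narrower format would free up.

```python
from typing import Sequence


def to_fixed(x: float, frac_bits: int) -> int:
    """Quantize x to a signed fixed-point integer with frac_bits fraction bits."""
    return round(x * 2 ** frac_bits)


def fixed_dot(a: Sequence[float], b: Sequence[float], frac_bits: int) -> float:
    """Dot product using integer multiply-accumulate, as narrow hardware would.

    Each product carries 2 * frac_bits fraction bits, so the accumulated sum
    is scaled back by that amount at the end.
    """
    acc = 0
    for x, y in zip(a, b):
        acc += to_fixed(x, frac_bits) * to_fixed(y, frac_bits)
    return acc / 2 ** (2 * frac_bits)


a = [0.1, 0.2, 0.3]
b = [0.4, 0.5, 0.6]
exact = sum(x * y for x, y in zip(a, b))   # floating-point reference

coarse_error = abs(fixed_dot(a, b, 4) - exact)    # 4 fraction bits
fine_error = abs(fixed_dot(a, b, 12) - exact)     # 12 fraction bits
```

With these inputs the 4-bit format gives a visibly larger error than the
12-bit format. A designer would pick the narrowest format whose error is
acceptable for the application, since narrower arithmetic units leave room
for more of them to run in parallel.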

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4-8
hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

                                                                                                            Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit
into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
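
Rent's rule is commonly written as T = t * g^p, where T is the number of
external terminals of a block of logic, g is the number of gates in the
block, t is the average number of terminals per gate, and p is the Rent
exponent. The sketch below evaluates that formula; the function name
rent_terminals and the parameter values are illustrative assumptions, not
figures taken from any particular device or design.

```python
def rent_terminals(gates: int, t: float, p: float) -> float:
    """Estimate external terminals with Rent's rule: T = t * gates ** p."""
    if gates <= 0:
        raise ValueError("gates must be positive")
    return t * gates ** p


# Two hypothetical designs with the same gate count but different exponents.
# A larger exponent models a design with heavier interconnect demand.
regular_design = rent_terminals(gates=1024, t=4.0, p=0.5)
routing_heavy_design = rent_terminals(gates=1024, t=4.0, p=0.75)
```

The comparison shows why gate count alone does not fix the routing demand: the
same amount of logic yields a much larger terminal estimate when the exponent
is higher.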

The example FPGA considered here is an array, or island-style, FPGA. It
consists of an array of logic blocks and routing channels. Two I/O pads fit
into the height of one row or the width of one column, as shown below. All
the routing channels have the same width (number of wires).

                                                                                                            121 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                            Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
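
The behavior of such a logic block can be modeled in a few lines of software.
The following is a minimal sketch, not a description of any vendor's
hardware: the class name LogicBlock and its methods are hypothetical. The LUT
is a 16-entry truth table indexed by the four inputs, the flip-flop captures
the LUT output on a clock edge, and a configuration flag selects whether the
registered or the unregistered value appears on the single output.

```python
from typing import Sequence


class LogicBlock:
    """Toy model of a 4-input LUT followed by an optional D flip-flop."""

    def __init__(self, truth_table: Sequence[int], registered: bool = False):
        if len(truth_table) != 16:
            raise ValueError("a 4-input LUT needs exactly 16 entries")
        self.truth_table = [bit % 2 for bit in truth_table]
        self.registered = registered  # selects which value drives the output
        self.ff_state = 0             # flip-flop contents, cleared at start
        self.inputs = (0, 0, 0, 0)

    def set_inputs(self, a: int, b: int, c: int, d: int) -> None:
        self.inputs = (a % 2, b % 2, c % 2, d % 2)

    def lut(self) -> int:
        """Look up the combinational result for the current inputs."""
        a, b, c, d = self.inputs
        return self.truth_table[a + 2 * b + 4 * c + 8 * d]

    def clock(self) -> None:
        """Clock edge: the flip-flop captures the current LUT output."""
        self.ff_state = self.lut()

    @property
    def output(self) -> int:
        return self.ff_state if self.registered else self.lut()


# Configure the LUT as a 4-input parity (XOR) function.
parity_table = [bin(i).count("1") % 2 for i in range(16)]

combinational = LogicBlock(parity_table)
combinational.set_inputs(1, 0, 0, 0)

registered = LogicBlock(parity_table, registered=True)
registered.set_inputs(1, 0, 0, 0)
before_clock = registered.output   # flip-flop has not captured anything yet
registered.clock()
after_clock = registered.output
```

The unregistered block reflects an input change immediately, while the
registered block only updates its output after a clock edge.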

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                            Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the
horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning
on some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch
box. In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this
switch box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect only
to other wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                                                            Switch box topology
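
The planar topology can also be enumerated programmatically. The sketch below
is an illustration under stated assumptions: the function names and the side
labels N, E, S, and W are hypothetical, and a switch is modeled simply as a
pair of wire ends that it can join. Every switch connects two wire ends with
the same track number on different sides of the box, which gives each
incoming wire exactly three possible connections.

```python
from typing import Set, Tuple

WireEnd = Tuple[str, int]            # (side of the switch box, track number)
Switch = Tuple[WireEnd, WireEnd]     # a programmable connection between two ends

SIDES = ("N", "E", "S", "W")


def planar_switch_box(width: int) -> Set[Switch]:
    """List the programmable switches of a planar switch box of given width."""
    switches = set()
    for track in range(width):
        for i, side_a in enumerate(SIDES):
            for side_b in SIDES[i + 1:]:
                # Only wires on the same track may be joined.
                switches.add(((side_a, track), (side_b, track)))
    return switches


def reachable(switches: Set[Switch], wire: WireEnd) -> Set[WireEnd]:
    """Wire ends that a given wire can reach through a single switch."""
    found = set()
    for end_a, end_b in switches:
        if end_a == wire:
            found.add(end_b)
        elif end_b == wire:
            found.add(end_a)
    return found


box = planar_switch_box(width=4)
from_west_track_1 = reachable(box, ("W", 1))
```

Because tracks never mix, a signal that enters the routing on one track stays
on that track through every planar switch box it crosses.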

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

                                                                                                            124 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures because it's possible to just
specify them numerically rather than having to draw every piece by hand. On
the other hand, schematic entry can allow for easier visualisation of a
design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
several steps, produces a configuration file. This file is then transferred
to the FPGA/CPLD via a serial interface (JTAG), or to an external memory
device such as an EEPROM.
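As a rough illustration of what "technology-mapped" means in the flow above, the sketch below computes the truth-table contents that a look-up table would need in order to implement a small Boolean function. It is written in Python rather than an HDL and is not how any vendor tool works internally; the 4-input LUT size, the bit ordering, and the names `lut_init` and `and_or` are assumptions made for this example.

```python
def lut_init(func, n_inputs=4):
    """Return the truth table of func packed into an integer.

    Bit i of the result holds the function output for input pattern i,
    where input 0 is the least significant bit of the pattern.
    (Assumed ordering; real devices define their own.)
    """
    init = 0
    for address in range(2 ** n_inputs):
        bits = [(address >> k) & 1 for k in range(n_inputs)]
        if func(*bits):
            init |= 1 << address
    return init


def and_or(a, b, c, d):
    """Example logic function: (a AND b) OR (c AND d)."""
    return (a & b) | (c & d)


if __name__ == "__main__":
    # Print the 16-bit table as four hex digits.
    print(f"{lut_init(and_or):04X}")
```

A mapper would produce one such table for every piece of logic it assigns to a LUT; place-and-route then decides which physical LUT receives each table.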

The most common HDLs are VHDL and Verilog, although in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction
level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or a similar
license), and from other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
exercise the system and observe the results. Then, after the synthesis engine
has mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
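The test-bench idea can be sketched with a small behavioural model. The example below models only a horizontal sync generator and checks it the way an RTL test bench would check a simulated design. It is written in Python instead of VHDL or Verilog, and it assumes the commonly used 640x480 line timing of 800 pixel clocks (640 visible, 16 front porch, 96 sync, 48 back porch) with an active-low sync pulse; the names `hsync_model` and `run_testbench` are hypothetical.

```python
# Assumed 640x480 horizontal timing, in pixel clocks.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 clocks per line


def hsync_model(n_clocks):
    """Yield (counter, hsync) for each pixel clock; hsync is active low."""
    counter = 0
    pulse_start = H_VISIBLE + H_FRONT
    pulse_end = pulse_start + H_SYNC
    for _ in range(n_clocks):
        in_pulse = pulse_start <= counter < pulse_end
        yield counter, 0 if in_pulse else 1
        counter = (counter + 1) % H_TOTAL


def run_testbench(lines=3):
    """Drive the model for a few lines and check the sync pulse timing.

    Returns the number of falling edges seen on hsync.
    """
    samples = list(hsync_model(lines * H_TOTAL))

    # The pulse must be low for exactly H_SYNC clocks on every line.
    low_counts = [count for count, hsync in samples if hsync == 0]
    assert len(low_counts) == lines * H_SYNC
    assert min(low_counts) == H_VISIBLE + H_FRONT
    assert max(low_counts) == H_VISIBLE + H_FRONT + H_SYNC - 1

    # Falling edges must be exactly one line period apart.
    falls = [i for i in range(1, len(samples))
             if samples[i - 1][1] == 1 and samples[i][1] == 0]
    assert all(b - a == H_TOTAL for a, b in zip(falls, falls[1:]))
    return len(falls)


if __name__ == "__main__":
    print("falling edges observed:", run_testbench())
```

In a real flow the same checks would be repeated against the gate-level netlist and again after back-annotation, so that any timing-related difference from the RTL behaviour shows up as a failed assertion.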

Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has
become mainstream, and prototyping SoC and ASIC designs on an FPGA has become
a good way to do this.

Reasons why Prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found at this stage.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together outside the simulation stage and in actual hardware.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA CONTROLLER (gate level)

Fig: SYNC UNIT (gate level)

Fig: TECHNOLOGY SCHEMATIC

                                                                                                            14 APPLICATIONS

                                                                                                            Motion Tracking Systems

                                                                                                            Image processing systems

                                                                                                            Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.

logic fabric. The Atmel FPSLIC is another such device, which uses an AVR
processor in combination with Atmel's programmable logic architecture.

An alternate approach to using hard-macro processors is to make use of soft
processor cores that are implemented within the FPGA logic (see Soft
processors below).

As previously mentioned, many modern FPGAs have the ability to be
reprogrammed at run time, and this is leading to the idea of reconfigurable
computing or reconfigurable systems: CPUs that reconfigure themselves to
suit the task at hand. The Mitrion Virtual Processor from Mitrionics is an
example of a reconfigurable soft processor implemented on FPGAs. However,
it does not support dynamic reconfiguration at runtime, but instead adapts
itself to a specific program.

Additionally, new non-FPGA architectures are beginning to emerge.
Software-configurable microprocessors such as the Stretch S5000 adopt a
hybrid approach by providing an array of processor cores and FPGA-like
programmable cores on the same chip.

Gates

1987: 9,000 gates, Xilinx

1992: 600,000, Naval Surface Warfare Department

Early 2000s: Millions

Market size

1985: First commercial FPGA technology invented by Xilinx

1987: $14 million

~1993: >$385 million

2005: $1.9 billion

2010 estimates: $2.75 billion

12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages include a shorter time to market, the ability to re-program in the
field to fix bugs, and lower non-recurring engineering costs. Vendors can also
take a middle road by developing their hardware on ordinary FPGAs, but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively

ASIC complexity has increased development time and costs

R&D resources and headcount are decreasing

Revenue losses for slow time-to-market are increasing

Financial constraints in a poor economy are driving low-cost technologies

These trends make FPGAs a better alternative than ASICs for a growing
number of higher-volume applications than they have historically been used
for, to which the company attributes the growing number of FPGA design starts
(see History).

The primary differences between CPLDs and FPGAs are architectural. A
CPLD has a somewhat restrictive structure consisting of one or more
programmable sum-of-products logic arrays feeding a relatively small number
of clocked registers. The result of this is less flexibility, with the
advantage of more predictable timing delays and a higher
logic-to-interconnect ratio. FPGA architectures, on the other hand, are
dominated by interconnect. This makes them far more flexible (in terms of the
range of designs that are practical for implementation within them), but also
far more complex to design for.
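To make the "sum-of-products array feeding a clocked register" structure concrete, here is a minimal behavioural sketch in Python. It is only an illustration of the idea, not a model of any particular CPLD; the data layout (each product term as a mapping from input name to required value) and the names `eval_sop`, `Macrocell` and `xor_terms` are assumptions made for this example.

```python
def eval_sop(product_terms, inputs):
    """Evaluate a sum-of-products expression.

    Each product term is a dict mapping an input name to the value (0 or 1)
    it must have; a term is true when all of its inputs match (AND), and the
    result is true when any term is true (OR).
    """
    return int(any(
        all(inputs[name] == wanted for name, wanted in term.items())
        for term in product_terms
    ))


class Macrocell:
    """A sum-of-products array whose output is captured by one register."""

    def __init__(self, product_terms):
        self.product_terms = product_terms
        self.q = 0  # registered output

    def clock(self, inputs):
        """Simulate one clock edge: latch the current SOP result."""
        self.q = eval_sop(self.product_terms, inputs)
        return self.q


# XOR written as a sum of products: (a AND NOT b) OR (NOT a AND b).
xor_terms = [{"a": 1, "b": 0}, {"a": 0, "b": 1}]

if __name__ == "__main__":
    cell = Macrocell(xor_terms)
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, cell.clock({"a": a, "b": b}))
```

Because every output passes through the same fixed AND-OR structure, the delay from input to register is much the same for any function, which is the predictable timing the paragraph above refers to.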

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders
or mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.
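The reason brute-force search suits parallel hardware is that the key space divides cleanly into independent slices. The sketch below shows that partitioning in Python. The 16-bit `toy_encrypt` function is a made-up stand-in and not a real cipher, and the slices are searched one after another here, whereas on an FPGA each slice would be handled by its own copy of the logic at the same time. All names are hypothetical.

```python
def toy_encrypt(plaintext, key):
    """Made-up 16-bit 'cipher' for illustration only; not a real algorithm."""
    return (plaintext * 40503 + key) & 0xFFFF


def search_slice(start, stop, plaintext, ciphertext):
    """Try every key in [start, stop); return the matching key or None."""
    for key in range(start, stop):
        if toy_encrypt(plaintext, key) == ciphertext:
            return key
    return None


def brute_force(plaintext, ciphertext, n_units=8, key_bits=16):
    """Split the key space into n_units equal slices and search each one.

    n_units is assumed to divide the key space evenly. Each slice is
    independent, so in hardware all of them could run concurrently.
    """
    slice_size = (1 << key_bits) // n_units
    for unit in range(n_units):
        found = search_slice(unit * slice_size, (unit + 1) * slice_size,
                             plaintext, ciphertext)
        if found is not None:
            return found
    return None


if __name__ == "__main__":
    secret = 0xBEEF
    observed = toy_encrypt(0x1234, secret)
    print(hex(brute_force(0x1234, observed)))
```

With N independent search units, the worst-case search time drops by roughly a factor of N, which is why replicating a small cipher core many times across the fabric is attractive.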

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.
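
The precision side of this trade-off can be illustrated with a small fixed-point sketch. The function names and the chosen bit widths below are hypothetical and only serve the illustration; narrower words cost less logic per arithmetic unit but give a larger quantization error and a smaller representable range.

```python
def to_fixed(x: float, frac_bits: int, total_bits: int) -> int:
    """Quantize x to a signed fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def from_fixed(q: int, frac_bits: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def quantization_error(x: float, frac_bits: int, total_bits: int) -> float:
    """Absolute error introduced by storing x in the given format."""
    return abs(x - from_fixed(to_fixed(x, frac_bits, total_bits), frac_bits))


if __name__ == "__main__":
    # Compare a narrow (8-bit) and a wider (16-bit) format for the same value.
    for total_bits, frac_bits in ((8, 4), (16, 12)):
        err = quantization_error(0.1, frac_bits, total_bits)
        print(f"{total_bits}-bit word, {frac_bits} fractional bits: error {err:.6f}")
```

A designer would pick the narrowest format whose error is acceptable for the application, since each saved bit frees logic for additional parallel units.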

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                                                                              Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads, and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule or by experiments with existing designs.
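
As a rough sketch of the kind of estimate Rent's rule gives: the rule relates the number of external terminals T of a block of logic to the number of logic elements g it contains, T = t * g^p, where t is the average number of terminals per element and p is the Rent exponent. The parameter values used below are illustrative assumptions, not figures from any particular FPGA family.

```python
def rent_terminals(elements: int, t: float, p: float) -> float:
    """Estimate external terminals for a region of `elements` logic blocks.

    t: average terminals per logic block (assumed value).
    p: Rent exponent, typically below 1 (assumed value).
    """
    if elements < 1:
        raise ValueError("elements must be at least 1")
    return t * elements ** p


if __name__ == "__main__":
    # A design with richer interconnect (higher p) needs more wiring
    # for the same amount of logic.
    for p in (0.5, 0.7):
        print(f"p = {p}: about {rent_terminals(100, 4.0, p):.0f} terminals")
```

The higher the exponent for a class of designs, the more routing tracks per channel a device needs before those designs become routable.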

The FPGA considered here is an array-based, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a flip-flop, as shown below. In recent years, manufacturers have started moving to 6-input LUTs in their high-performance parts, claiming increased performance.

                                                                                                              Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
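
The behaviour of such a logic block can be sketched in software. The class below is a hypothetical model, not vendor code: the 16-bit `truth_table` stands in for the LUT configuration bits, and the `registered` flag selects which of the two possible outputs the block drives.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by a flip-flop."""

    def __init__(self, truth_table: int, registered: bool):
        self.truth_table = truth_table & 0xFFFF  # one bit per input combination
        self.registered = registered             # output mux selection
        self.ff = 0                              # flip-flop state

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Look up the configured output bit for inputs a (LSB) to d (MSB)."""
        index = (d << 3) | (c << 2) | (b << 1) | a
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000, registered=False)  # only input 1111 gives 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 0, 1, 1))
```

Any 4-input Boolean function fits in the same block by changing only the 16 configuration bits; for example, 0x6996 configures a 4-input XOR.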

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                                              Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                                              Switch box topology
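
The planar connection pattern can also be written down as a minimal sketch. The side names and the function `planar_connections` are illustrative assumptions; tracks are numbered from 1, as in the text.

```python
SIDES = ("north", "east", "south", "west")


def planar_connections(side: str, track: int, width: int) -> list[tuple[str, int]]:
    """Wires reachable from a wire entering a planar switch box.

    A wire on `track` can be switched only onto the same track number
    on each of the other three sides of the switch box.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= width:
        raise ValueError("track must be between 1 and the channel width")
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    # A wire entering from the west on track 2 of a 4-wire channel.
    print(planar_connections("west", 2, 4))
```

Because a signal never changes track number inside a switch box, each track forms its own routing domain, which is why the topology is also called domain-based.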

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and from other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

                                                                                                              Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                              Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs, together with early software and firmware co-design, has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

                                                                                                              Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This compares favorably with designers relying only on software simulations to verify that their hardware design is sound, since simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, beyond the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                                              13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                                              14 APPLICATIONS

                                                                                                              Motion Tracking Systems

                                                                                                              Image processing systems

Display Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for each
pixel, as well as horizontal and vertical synchronization signals, in order to
correctly display the picture on a connected monitor.
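
The synchronization behaviour summarized above can be sketched in software. The
following Python model is only an illustration, not the report's hardware
design: it assumes the commonly cited 640x480 at 60 Hz timing (a 25.175 MHz
pixel clock and active-low sync pulses), and the names vga_signals,
next_position and refresh_rate_hz are hypothetical.

```python
# Commonly cited 640x480 @ 60 Hz timing (an assumption, not taken from this report).
# Horizontal values are in pixel clocks, vertical values are in lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK   # 800 clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK   # 525 lines per frame
PIXEL_CLOCK_HZ = 25_175_000                       # assumed pixel clock


def vga_signals(x, y):
    """Return (hsync, vsync, video_on) for horizontal count x and line count y.

    Both sync signals are modelled as active low: False during the pulse.
    """
    in_hsync = H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC
    in_vsync = V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC
    video_on = x < H_VISIBLE and y < V_VISIBLE
    return (not in_hsync, not in_vsync, video_on)


def next_position(x, y):
    """Advance the two counters by one pixel clock, wrapping at line and frame end."""
    x += 1
    if x == H_TOTAL:
        x = 0
        y = (y + 1) % V_TOTAL
    return x, y


def refresh_rate_hz():
    """Frames per second implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    print(vga_signals(0, 0))
    print(round(refresh_rate_hz(), 2))
```

RGB data would be driven only while video_on is true; outside the visible area
the colour outputs are normally held at zero.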




12.2 FPGA Comparisons

Historically, FPGAs have been slower, less energy efficient, and generally
achieved less functionality than their fixed ASIC counterparts. A combination
of volume, fabrication improvements, research and development, and the I/O
capabilities of new supercomputers has largely closed the performance gap
between ASICs and FPGAs.

Advantages of FPGAs include a shorter time to market, the ability to re-program
in the field to fix bugs, and lower non-recurring engineering costs. Vendors
can also take a middle road by developing their hardware on ordinary FPGAs but
manufacturing their final version so that it can no longer be modified after
the design has been committed.

Xilinx claims that several market and technology dynamics are changing the
ASIC/FPGA paradigm:

IC costs are rising aggressively.

ASIC complexity has bolstered development time and costs.

R&D resources and headcount are decreasing.

Revenue losses for slow time-to-market are increasing.

Financial constraints in a poor economy are driving low-cost technologies.

These trends make FPGAs a better alternative than ASICs for a growing number
of higher-volume applications than they have historically been used for, and
the company attributes the growing number of FPGA design starts to them
(see History).

The primary differences between CPLDs and FPGAs are architectural. A CPLD has
a somewhat restrictive structure consisting of one or more programmable
sum-of-products logic arrays feeding a relatively small number of clocked
registers. The result is less flexibility, with the advantage of more
predictable timing delays and a higher logic-to-interconnect ratio. FPGA
architectures, on the other hand, are dominated by interconnect. This makes
them far more flexible (in terms of the range of designs that are practical
for implementation within them) but also far more complex to design for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers) and
embedded memories, as well as logic blocks that can implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
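
To make the precision trade-off concrete, the sketch below quantizes values to
a narrow signed fixed-point format of the kind that lets more arithmetic units
fit in the same logic. It is a minimal illustration in Python; the 8-bit format
with 6 fractional bits and the helper names to_fixed, from_fixed and fixed_dot
are assumptions chosen for the example, not part of any particular FPGA flow.

```python
def to_fixed(value, frac_bits, total_bits):
    """Quantize value to a signed fixed-point integer, saturating at the limits."""
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, scaled))


def from_fixed(raw, frac_bits):
    """Convert a fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


def fixed_dot(a, b, frac_bits, total_bits):
    """Dot product on quantized operands, as parallel narrow multipliers would do."""
    qa = [to_fixed(x, frac_bits, total_bits) for x in a]
    qb = [to_fixed(x, frac_bits, total_bits) for x in b]
    # Each product carries 2 * frac_bits fractional bits.
    accumulator = sum(x * y for x, y in zip(qa, qb))
    return accumulator / (1 << (2 * frac_bits))


if __name__ == "__main__":
    a = [0.5, -0.25, 0.75]
    b = [0.5, 0.5, -1.0]
    print(fixed_dot(a, b, frac_bits=6, total_bits=8))
    # 0.1 is not representable with 6 fractional bits, so some precision is lost.
    print(from_fixed(to_fixed(0.1, 6, 8), 6))
```

Values outside the representable range saturate, which is the "range" half of
the trade-off; values between representable steps are rounded, which is the
"precision" half.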

The adoption of FPGAs in high performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of 4 to
8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a programmable
chip is more affordable than the development resources spent on creating an
ASIC. Today, new cost and performance dynamics have broadened the range of
viable applications.

                                                                                                                Architecture

The most common FPGA architecture consists of an array of configurable logic
blocks (CLBs), I/O pads, and routing channels. Generally, all the routing
channels have the same width (number of wires). Multiple I/O pads may fit into
the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably even among
designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived from
Rent's rule or by experiments with existing designs.
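
Rent's rule relates the number of external terminals T of a block of logic to
the number of gates or logic blocks g it contains, T = t * g^p, where t is the
average number of terminals per block and p is the Rent exponent. The Python
sketch below only evaluates that formula; the parameter values in the usage
lines are illustrative assumptions, not measurements from any device, and the
function name rent_terminals is hypothetical.

```python
def rent_terminals(num_blocks, terminals_per_block, rent_exponent):
    """Estimate external terminals with Rent's rule: T = t * g ** p."""
    if num_blocks < 1:
        raise ValueError("num_blocks must be at least 1")
    if not 0.0 <= rent_exponent <= 1.0:
        raise ValueError("rent_exponent is expected to lie between 0 and 1")
    return terminals_per_block * num_blocks ** rent_exponent


if __name__ == "__main__":
    # Illustrative values only: 1024 logic blocks with 4 terminals each.
    for p in (0.5, 0.6, 0.7):
        print(p, round(rent_terminals(1024, 4.0, p), 1))
```

A larger exponent means more signals cross the boundary of a region of the
same size, which is why designs with equal logic counts can need very
different numbers of routing tracks.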

The FPGA described here is an array, or island-style, FPGA. It consists of an
array of logic blocks and routing channels. Two I/O pads fit into the height
of one row or the width of one column, as shown below. All the routing
channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high performance parts, claiming increased
performance.

                                                                                                                Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they are managed separately from other signals.
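
The logic block described above can be modelled in a few lines of software.
The following Python sketch is only a behavioural illustration: the class name
LogicBlock, the 16-bit truth-table encoding (bit i holds the output for input
pattern i, with input a as the least significant bit) and the clock method are
assumptions made for this example, not a vendor primitive.

```python
class LogicBlock:
    """Behavioural model of a 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth_table must fit in 16 bits")
        self.truth_table = truth_table  # bit i = LUT output for input pattern i
        self.registered = registered    # select registered or unregistered output
        self.q = 0                      # flip-flop state

    def lut(self, a, b, c, d):
        """Look up the combinational output for the four input bits."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model a rising clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: flip-flop state if registered, else LUT output."""
        return self.q if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000)                     # only pattern 1111 gives 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 0, 1, 1))
    parity = LogicBlock(0x6996, registered=True)  # 4-input XOR, registered
    parity.clock(1, 0, 0, 0)
    print(parity.output(0, 0, 0, 0))
```

Any 4-input Boolean function is obtained simply by changing the 16-bit table,
which is what makes the LUT the basic programmable element.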

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                                Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans
only one logic block before it terminates in a switch box. By turning on some
of the programmable switches within a switch box, longer paths can be
constructed. For higher speed interconnect, some FPGA architectures use longer
routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box.
In this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in
adjacent channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track number
1 in adjacent channel segments, wires in track number 2 connect only to other
wires in track number 2, and so on. The figure below illustrates the
connections in a switch box.

                                                                                                                Switch box topology
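
The planar topology can also be written down as a small data structure. The
Python sketch below enumerates the programmable switches of one switch box
under the rule stated above, that a wire on a given track connects only to the
same track on the other three sides. The side names and the function names
planar_switches and neighbours are hypothetical, chosen only for this
illustration.

```python
from itertools import combinations

SIDES = ("top", "right", "bottom", "left")


def planar_switches(channel_width):
    """Return the set of switches in a planar (domain-based) switch box.

    Each switch is a frozenset holding the two (side, track) wire ends it joins.
    """
    switches = set()
    for track in range(channel_width):
        for side_a, side_b in combinations(SIDES, 2):
            switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(wire, switches):
    """List the wire ends reachable from wire through a single switch."""
    return sorted(end for sw in switches if wire in sw for end in sw if end != wire)


if __name__ == "__main__":
    box = planar_switches(channel_width=4)
    print(len(box))                      # 6 switches per track
    print(neighbours(("left", 1), box))  # same track on the other three sides
```

Because tracks never mix inside the box, a signal that enters the routing on
one track stays on that track for its whole path in this architecture.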

Modern FPGA families expand upon the above capabilities to include
higher-level functionality fixed into the silicon. Having these common
functions embedded into the silicon reduces the area required and gives those
functions increased speed compared to building them from primitives. Examples
of these include multipliers, generic DSP blocks, embedded processors,
high-speed I/O logic, and embedded memories.


FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows
chip companies to validate their design before the chip is produced in the
factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to
work with when handling large structures, because it is possible to specify
them numerically rather than having to draw every piece by hand. On the other
hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place, and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's proprietary
software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration works as
follows. The source files are fed to a software suite from the FPGA/CPLD
vendor that, through different steps, produces a file. This file is then
transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external
memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. However, because HDLs have been
compared to the equivalent of assembly languages, there are moves to reduce
the complexity of designing in them by raising the abstraction level through
the introduction of alternative languages.


To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD, or a similar
license) and other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm that the synthesis
proceeded without errors. Finally, the design is laid out in the FPGA, at
which point propagation delays can be added and the simulation run again with
these values back-annotated onto the netlist.
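
The idea of repeating the same test bench against the RTL description and
against the gate-level description can be illustrated with a small sketch. It
is written in Python rather than VHDL or Verilog purely for illustration, and
it is not part of the original design: the function names (`adder_rtl`,
`adder_gates`, `run_testbench`) and the choice of a full adder as the circuit
under test are assumptions.

```python
from itertools import product


def adder_rtl(a, b, cin):
    """Behavioral (RTL-style) full adder: returns (sum, carry_out)."""
    total = a + b + cin
    return total & 1, total >> 1


def adder_gates(a, b, cin):
    """Gate-level full adder, written the way a synthesized netlist might be."""
    a_xor_b = a ^ b
    s = a_xor_b ^ cin
    cout = (a & b) | (a_xor_b & cin)
    return s, cout


def run_testbench(reference, candidate, n_inputs):
    """Apply every input vector to both models and collect the mismatches."""
    mismatches = []
    for vector in product((0, 1), repeat=n_inputs):
        if reference(*vector) != candidate(*vector):
            mismatches.append(vector)
    return mismatches


if __name__ == "__main__":
    # An empty list means the gate-level model agrees with the RTL model.
    print(run_testbench(adder_rtl, adder_gates, 3))
```

An exhaustive sweep is only practical for very small circuits; a real test
bench would normally apply selected or randomized stimuli instead.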

                                                                                                                Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx
alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; its Solaris and Linux tools are only available
via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused
CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA
for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design
has become mainstream practice. Prototyping SoC and ASIC designs on an FPGA
has become a good method for doing this.


                                                                                                                Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct. This is in contrast to designers relying only on
software simulations to verify that their hardware design is sound, where
simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is
back from the foundry to start on software tests. FPGA prototyping allows
much more time for software development and testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
due to today's array of operating systems, applications, and hardware, to be
found.

3. Prototyping also allows the developer to ensure that all IP technologies
on the system work well together, beyond the simulation stage and in actual
form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


                                                                                                                13 SIMULATION RESULTS

Fig. VGA Controller (gate level)

Fig. Sync Unit (gate level)

Fig. Technology schematic

                                                                                                                14 APPLICATIONS

                                                                                                                Motion Tracking Systems

                                                                                                                Image processing systems

                                                                                                                Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.
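
How horizontal and vertical synchronization signals can be derived from two
counters is sketched below. This is a behavioral illustration in Python, not
the design presented in this report. The timing constants are the widely used
industry values for 640x480 at 60 Hz with a 25.175 MHz pixel clock, and they
are an assumption here because this section does not list them; the function
names are likewise hypothetical.

```python
# Assumed standard 640x480 @ 60 Hz timing, in pixel clocks (horizontal)
# and in lines (vertical). These values are not taken from this section.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525 lines per frame
PIXEL_CLOCK_HZ = 25_175_000


def vga_signals(x, y):
    """Return (hsync, vsync, video_on) for counter position (x, y).

    Both sync pulses are modeled as active low, as in this video mode.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= x < h_start + H_SYNC else 1
    vsync = 0 if v_start <= y < v_start + V_SYNC else 1
    video_on = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, video_on


def next_position(x, y):
    """Advance the pixel counter by one clock, wrapping at line and frame end."""
    x += 1
    if x == H_TOTAL:
        x = 0
        y = (y + 1) % V_TOTAL
    return x, y


def refresh_rate_hz():
    """Frame rate implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)
```

RGB values would be driven only while `video_on` is true; during the front
porch, sync pulse, and back porch the color outputs are held at black.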




makes them far more flexible (in terms of the range of designs that are
practical for implementation within them) but also far more complex to design
for.

Another notable difference between CPLDs and FPGAs is the presence in most
FPGAs of higher-level embedded functions (such as adders and multipliers)
and embedded memories, as well as logic blocks that implement decoders or
mathematical functions.

Some FPGAs have the capability of partial re-configuration, which lets one
portion of the device be re-programmed while other portions continue running.

12.3 Applications

Applications of FPGAs include digital signal processing, software-defined
radio, aerospace and defense systems, ASIC prototyping, medical imaging,
computer vision, speech recognition, cryptography, bioinformatics, computer
hardware emulation, radio astronomy, and a growing range of other areas.

FPGAs originally began as competitors to CPLDs and competed in a similar
space, that of glue logic for PCBs. As their size, capabilities, and speed
increased, they began to take over larger and larger functions, to the point
where some are now marketed as full systems on chips (SoC). Particularly with
the introduction of dedicated multipliers into FPGA architectures in the late
1990s, applications which had traditionally been the sole reserve of DSPs
began to incorporate FPGAs instead.

FPGAs especially find applications in any area or algorithm that can make use
of the massive parallelism offered by their architecture. One such area is
code breaking, in particular brute-force attacks on cryptographic algorithms.

FPGAs are increasingly used in conventional high-performance computing
applications, where computational kernels such as FFT or convolution are
performed on the FPGA instead of a microprocessor.


The inherent parallelism of the logic resources on an FPGA allows for
considerable computational throughput even at low MHz clock rates. The
flexibility of the FPGA allows for even higher performance by trading off
precision and range in the number format for an increased number of parallel
arithmetic units. This has driven a new type of processing called
reconfigurable computing, where time-intensive tasks are offloaded from
software to FPGAs.
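
The arithmetic behind this trade-off can be sketched as follows. All numbers
in the usage note are hypothetical and chosen only to show the shape of the
calculation; the function names and the simple resource-budget model (each
arithmetic unit has a fixed cost that depends on its number format) are
assumptions, not figures from any particular device.

```python
def units_that_fit(resource_budget, cost_per_unit):
    """Number of parallel arithmetic units that fit in a fixed logic budget."""
    if cost_per_unit <= 0:
        raise ValueError("cost_per_unit must be positive")
    return resource_budget // cost_per_unit


def peak_ops_per_second(parallel_units, clock_hz, ops_per_unit_per_cycle=1):
    """Peak throughput when every unit completes work on every clock cycle."""
    return parallel_units * clock_hz * ops_per_unit_per_cycle


if __name__ == "__main__":
    budget = 10_000  # hypothetical logic resources available
    wide = units_that_fit(budget, 400)    # hypothetical cost of a wide-format unit
    narrow = units_that_fit(budget, 100)  # hypothetical cost of a narrow-format unit
    print(peak_ops_per_second(wide, 100_000_000))
    print(peak_ops_per_second(narrow, 100_000_000))
```

Under these assumed costs, the narrower number format fits four times as many
units into the same budget, so peak throughput at the same clock rate is four
times higher, at the price of reduced precision and range.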

The adoption of FPGAs in high-performance computing is currently limited by
the complexity of FPGA design compared to conventional software and by the
extremely long turn-around times of current design tools, where a wait of
4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications
where the volume of production is small. For these low-volume applications,
the premium that companies pay in hardware costs per unit for a
programmable chip is more affordable than the development resources spent
on creating an ASIC. Today, new cost and performance dynamics have broadened
the range of viable applications.

                                                                                                                  Architecture

The most common FPGA architecture consists of an array of configurable
logic blocks (CLBs), I/O pads, and routing channels. Generally, all the
routing channels have the same width (number of wires). Multiple I/O pads may
fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources.
While the number of CLBs and I/Os required is easily determined from the
design, the number of routing tracks needed may vary considerably, even
among designs with the same amount of logic. (For example, a crossbar switch
requires much more routing than a systolic array with the same gate count.)
Since unused routing tracks increase the cost (and decrease the performance)
of the part without providing any benefit, FPGA manufacturers try to provide
just enough tracks so that most designs that will fit in terms of LUTs and
I/Os can be routed. This is determined by estimates such as those derived
from Rent's rule or by experiments with existing designs.
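
Rent's rule is usually stated as T = t * g^p, where T is the number of
external terminals of a block of logic, g is the number of gates (or logic
blocks) inside it, t is the average number of terminals per gate, and p is
the Rent exponent. A minimal sketch of the estimate follows; the function
name and the parameter values in the usage note are illustrative assumptions,
not data from any real device.

```python
def rent_terminals(t, g, p):
    """Estimate external terminals T = t * g**p for a block of g gates."""
    if g < 1:
        raise ValueError("g must be at least 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("the Rent exponent p is expected to lie in [0, 1]")
    return t * g ** p


if __name__ == "__main__":
    # Same gate count, different exponents: a higher p (more irregular,
    # crossbar-like connectivity) implies more wiring demand.
    print(rent_terminals(4, 1024, 0.5))
    print(rent_terminals(4, 1024, 0.75))
```

A design with a higher Rent exponent needs more routing for the same amount
of logic, which is consistent with the crossbar versus systolic array
comparison above.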

The FPGA is an array or island-style FPGA. It consists of an array of logic
blocks and routing channels. Two I/O pads fit into the height of one row or
the width of one column, as shown below. All the routing channels have the
same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                                  Typical logic block


There is only one output, which can be either the registered or the
unregistered LUT output. The logic block has four inputs for the LUT and a
clock input. Since clock signals (and often other high-fanout signals) are
normally routed via special-purpose dedicated routing networks in commercial
FPGAs, they and other signals are separately managed.
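
The logic block described above (a 4-input LUT feeding a flip-flop, with a
single output that is either registered or unregistered) can be sketched as a
small behavioral model. This is a minimal illustration in Python, not vendor
code: the class name `LogicBlock`, the 16-bit `truth_table` encoding, and the
`registered` configuration flag are assumptions made for the example.

```python
class LogicBlock:
    """Behavioral model of a 4-input LUT followed by an optional D flip-flop."""

    def __init__(self, truth_table, registered=False):
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth_table must fit in 16 bits")
        self.truth_table = truth_table  # bit i is the output for input pattern i
        self.registered = registered    # selects the flip-flop or the LUT output
        self.ff = 0                     # flip-flop state, cleared at start-up

    def lut(self, a, b, c, d):
        """Combinational lookup: the four inputs form a 4-bit table index."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a, b, c, d):
        """Model one active clock edge: the flip-flop captures the LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a, b, c, d):
        """Single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000)                     # only pattern 1111 yields 1
    parity = LogicBlock(0x6996, registered=True)  # 4-input XOR, registered
    print(and4.output(1, 1, 1, 1))
    parity.clock(1, 0, 0, 0)
    print(parity.output(0, 0, 0, 0))
```

Storing the function as a truth table is what makes the block configurable:
any Boolean function of four inputs is a different 16-bit constant, with no
change to the surrounding structure.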

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                                  Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output
pin can connect to routing wires in both the channel to the right and the
channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                                                  Switch box topology
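
The planar (domain-based) connection rule can be sketched in a few lines of Python. This is only an illustrative model and not any vendor's routing fabric: the side names, the function names and the channel width of 4 are assumptions introduced here.

```python
# Illustrative model of a planar (domain-based) switch box.
# Side names and function names are assumptions made for this sketch.
SIDES = ("left", "top", "right", "bottom")


def planar_switch_box(width):
    """Return the set of programmable switches in one switch box.

    Each switch is an unordered pair of (side, track) endpoints. In the
    planar topology a wire on track t may only connect to track t on the
    three other sides of the box.
    """
    switches = set()
    for track in range(width):
        for i, a in enumerate(SIDES):
            for b in SIDES[i + 1:]:
                switches.add(frozenset({(a, track), (b, track)}))
    return switches


def neighbours(switches, side, track):
    """List the wires reachable from (side, track) through one switch."""
    here = (side, track)
    return sorted(w for s in switches if here in s for w in s if w != here)


if __name__ == "__main__":
    box = planar_switch_box(4)
    # A wire entering on the left side, track 1, sees exactly three switches,
    # all of them leading to track 1 on another side.
    print(neighbours(box, "left", 1))
```

With this rule a channel width of W gives 6 x W switches per box, and a signal stays on the same track number unless a logic block or I/O pad moves it to another track.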

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it is possible to specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.

                                                                                                                  Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                  Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Hardware verification combined with early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on FPGAs has become a good method for doing this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA controller (gate level)

Fig: Sync unit (gate level)

Fig: Technology schematic

                                                                                                                  14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
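
As a rough software model of this behaviour, the Python sketch below steps a pixel counter and a line counter through one frame and derives the sync and RGB outputs from them. It is not the design's HDL: the timing constants are the widely used 640x480 at 60 Hz values (25.175 MHz pixel clock), assumed here because the text does not state them, and `read_pixel` is a hypothetical stand-in for the memory interface.

```python
# Widely used 640x480 @ 60 Hz timing (assumed here; not stated in the text).
# Horizontal values are in pixel clocks, vertical values are in lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # 800 clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # 525 lines per frame


def vga_signals(x, y):
    """Return (hsync, vsync, visible) for pixel counter x and line counter y.

    Both sync pulses are modelled as active low, as is usual for this mode.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = 0 if h_start <= x < h_start + H_SYNC else 1
    vsync = 0 if v_start <= y < v_start + V_SYNC else 1
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible


def scan_frame(read_pixel):
    """Yield (r, g, b, hsync, vsync) for every pixel clock of one frame.

    read_pixel(x, y) is a hypothetical stand-in for the memory interface;
    outside the visible area the RGB outputs are blanked to zero.
    """
    for y in range(V_TOTAL):
        for x in range(H_TOTAL):
            hsync, vsync, visible = vga_signals(x, y)
            r, g, b = read_pixel(x, y) if visible else (0, 0, 0)
            yield r, g, b, hsync, vsync


if __name__ == "__main__":
    # Solid red test picture; count the clocks that carry picture data.
    frame = scan_frame(lambda x, y: (255, 0, 0))
    lit = sum(1 for r, g, b, hs, vs in frame if r)
    print(lit)  # 640 * 480 visible pixels
```

In hardware the two loops correspond to two free-running counters clocked by the pixel clock, and the comparisons in `vga_signals` become simple comparators on those counters.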

The inherent parallelism of the logic resources on an FPGA allows for considerable computational throughput even at low MHz clock rates. The flexibility of the FPGA allows for even higher performance by trading off precision and range in the number format for an increased number of parallel arithmetic units. This has driven a new type of processing called reconfigurable computing, where time-intensive tasks are offloaded from software to FPGAs.

The adoption of FPGAs in high-performance computing is currently limited by the complexity of FPGA design compared to conventional software and by the extremely long turn-around times of current design tools, where a wait of 4-8 hours is necessary after even minor changes to the source code.

Traditionally, FPGAs have been reserved for specific vertical applications where the volume of production is small. For these low-volume applications, the premium that companies pay in hardware costs per unit for a programmable chip is more affordable than the development resources spent on creating an ASIC for a low-volume application. Today, new cost and performance dynamics have broadened the range of viable applications.

                                                                                                                    Architecture

The most common FPGA architecture consists of an array of configurable logic blocks (CLBs), I/O pads and routing channels. Generally, all the routing channels have the same width (number of wires). Multiple I/O pads may fit into the height of one row or the width of one column in the array.

An application circuit must be mapped into an FPGA with adequate resources. While the number of CLBs and I/Os required is easily determined from the design, the number of routing tracks needed may vary considerably, even among designs with the same amount of logic. (For example, a crossbar switch requires much more routing than a systolic array with the same gate count.) Since unused routing tracks increase the cost (and decrease the performance) of the part without providing any benefit, FPGA manufacturers try to provide just enough tracks so that most designs that will fit in terms of LUTs and I/Os can be routed. This is determined by estimates such as those derived from Rent's rule, or by experiments with existing designs.
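
For reference, Rent's rule relates the number of external terminals T of a block of logic to the number of gates g it contains as T = t * g^p, where t is the average number of terminals per gate and p is the Rent exponent. The short Python sketch below only evaluates that formula; the function name and the parameter values in the example are illustrative assumptions, not figures from any particular FPGA.

```python
def rent_terminals(gates, t, p):
    """Estimate external terminals with Rent's rule, T = t * gates**p.

    gates: number of logic gates (or blocks) in the region
    t:     average number of terminals per gate
    p:     Rent exponent, between 0 and 1
    """
    if gates <= 0:
        raise ValueError("gates must be positive")
    if not 0.0 <= p <= 1.0:
        raise ValueError("the Rent exponent must lie between 0 and 1")
    return t * gates ** p


if __name__ == "__main__":
    # Illustrative numbers only: 100 blocks, 4 terminals each, exponent 0.5.
    print(rent_terminals(100, 4, 0.5))
```

A higher exponent means the design needs more wiring for the same amount of logic, which is the effect the crossbar versus systolic array comparison describes.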

The FPGA is an array, or island-style, FPGA. It consists of an array of logic blocks and routing channels. Two I/O pads fit into the height of one row or the width of one column, as shown below. All the routing channels have the same width (number of wires).

12.1 FPGA structure

A classic FPGA logic block consists of a 4-input lookup table (LUT) and a
flip-flop, as shown below. In recent years, manufacturers have started moving
to 6-input LUTs in their high-performance parts, claiming increased
performance.

                                                                                                                    Typical logic block

There is only one output, which can be either the registered or the unregistered
LUT output. The logic block has four inputs for the LUT and a clock input.
Since clock signals (and often other high-fanout signals) are normally routed
via special-purpose dedicated routing networks in commercial FPGAs, they
are managed separately from other signals.
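The logic block described above can be sketched as a small behavioral model: a 16-entry truth table (the 4-input LUT), a flip-flop, and a selection between the registered and unregistered output. This is only an illustration in Python. The class name, the bit ordering of the truth table, and the flip-flop resetting to 0 are assumptions made for the sketch, not details of any particular FPGA family.

```python
class LogicBlock:
    """Behavioral model of one 4-input LUT followed by an optional flip-flop."""

    def __init__(self, truth_table: int, registered: bool):
        # truth_table is a 16-bit integer; bit i is the LUT output for input value i
        # (assumed ordering: input a is bit 0 of the index, d is bit 3).
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("truth table must fit in 16 bits")
        self.truth_table = truth_table
        self.registered = registered
        self.ff = 0  # flip-flop state, assumed to start at 0

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output for the four inputs."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock(self, a: int, b: int, c: int, d: int) -> None:
        """Rising clock edge: the flip-flop captures the current LUT output."""
        self.ff = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: registered or unregistered LUT value."""
        return self.ff if self.registered else self.lut(a, b, c, d)


if __name__ == "__main__":
    and4 = LogicBlock(0x8000, registered=False)  # only input 1111 gives 1
    print(and4.output(1, 1, 1, 1), and4.output(1, 0, 1, 1))
```

Configuring the block amounts to choosing the 16 truth-table bits and the output select bit, which is what the configuration bitstream stores for each logic block in this simplified picture.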

For this example architecture, the locations of the FPGA logic block pins are
shown below.

                                                                                                                    Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin
can connect to routing wires in both the channel to the right and the channel
below the logic block.

Each logic block output pin can connect to any of the wiring segments in the
channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the
channel adjacent to it. For example, an I/O pad at the top of the chip can
connect to any of the W wires (where W is the channel width) in the horizontal
channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment
spans only one logic block before it terminates in a switch box. By turning on
some of the programmable switches within a switch box, longer paths can be
constructed. For higher-speed interconnect, some FPGA architectures use
longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In
this architecture, when a wire enters a switch box, there are three
programmable switches that allow it to connect to three other wires in adjacent
channel segments. The pattern, or topology, of switches used in this
architecture is the planar or domain-based switch box topology. In this switch
box topology, a wire in track number 1 connects only to wires in track
number 1 in adjacent channel segments, wires in track number 2 connect
only to other wires in track number 2, and so on. The figure below illustrates
the connections in a switch box.

                                                                                                                    Switch box topology
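The planar (domain-based) rule can be stated in a few lines of code: a wire entering on one side of the switch box can reach only the wire with the same track number on each of the other three sides. The sketch below is illustrative. The side names, the 1-based track numbering, and the function name are assumptions chosen to match the description above.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_connections(side: str, track: int, channel_width: int):
    """Return the (side, track) wires reachable from one wire in a planar switch box."""
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track must be between 1 and the channel width W")
    # Three programmable switches, one per other side, all on the same track.
    return [(other, track) for other in SIDES if other != side]


if __name__ == "__main__":
    print(planar_connections("left", 2, channel_width=4))
```

Because a signal can never change track inside a switch box, a route that starts on track k stays on track k for its whole length. This keeps the switch box small at the cost of some routing flexibility.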

Modern FPGA families expand upon the above capabilities to include higher-level
functionality fixed into the silicon. Having these common functions
embedded into the silicon reduces the area required and gives those functions
increased speed compared to building them from primitives. Examples of these
include multipliers, generic DSP blocks, embedded processors, high-speed I/O
logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon
validation, post-silicon validation, and firmware development. This allows chip
companies to validate their design before the chip is produced in the factory,
reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a design in a hardware
description language (HDL) or as a schematic. The HDL form might be easier to work
with when handling large structures because it's possible to just specify them
numerically rather than having to draw every piece by hand. On the other hand,
schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist
is generated. The netlist can then be fitted to the actual FPGA architecture
using a process called place-and-route, usually performed by the FPGA
company's proprietary place-and-route software. The user will validate the
map, place and route results via timing analysis, simulation, and other
verification methodologies. Once the design and validation process is
complete, the binary file generated (also using the FPGA company's
proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source
files are fed to a software suite from the FPGA/CPLD vendor that, through
different steps, will produce a file. This file is then transferred to the
FPGA/CPLD via a serial interface (JTAG) or to an external memory device
like an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to
reduce the complexity of designing in HDLs, which have been compared to the
equivalent of assembly languages, there are moves to raise the abstraction level
through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of
predefined complex functions and circuits that have been tested and optimized
to speed up the design process. These predefined circuits are commonly called
IP cores, and are available from FPGA vendors and third-party IP suppliers
(rarely free, and typically released under proprietary licenses). Other
predefined circuits are available from developer communities such as
OpenCores (typically free, and released under the GPL, BSD or similar license) and
other sources.

In a typical design flow, an FPGA application developer will simulate the
design at multiple stages throughout the design process. Initially, the RTL
description in VHDL or Verilog is simulated by creating test benches to
simulate the system and observe results. Then, after the synthesis engine has
mapped the design to a netlist, the netlist is translated to a gate-level
description, where simulation is repeated to confirm the synthesis proceeded
without errors. Finally, the design is laid out in the FPGA, at which point
propagation delays can be added and the simulation run again with these
values back-annotated onto the netlist.
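The idea of repeating the simulation at gate level can be illustrated by driving the same stimulus into a behavioral reference model and into a gate-level model, then comparing the outputs. The sketch below uses Python in place of an HDL simulator and a full adder built only from NAND gates as a stand-in for a post-synthesis netlist. Both models and all function names are hypothetical examples and are not part of the VGA controller design.

```python
from itertools import product


def rtl_model(a: int, b: int, cin: int):
    """Reference ("RTL") full adder, described behaviorally."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry out)


def nand(x: int, y: int) -> int:
    return 1 - (x & y)


def gate_model(a: int, b: int, cin: int):
    """Hypothetical post-synthesis netlist: the same adder from NAND gates only."""
    n1 = nand(a, b)
    x1 = nand(nand(a, n1), nand(b, n1))      # a XOR b
    n2 = nand(x1, cin)
    s = nand(nand(x1, n2), nand(cin, n2))    # (a XOR b) XOR cin
    cout = nand(n1, n2)                      # (a AND b) OR ((a XOR b) AND cin)
    return s, cout


def mismatches(ref, dut, n_inputs: int):
    """Apply every input vector to both models and list the vectors that differ."""
    return [v for v in product((0, 1), repeat=n_inputs) if ref(*v) != dut(*v)]


if __name__ == "__main__":
    print(mismatches(rtl_model, gate_model, 3))
```

An empty list means the two descriptions agree on every input vector. Exhaustive comparison is only practical for small blocks like this one; for larger designs a test bench applies selected stimulus instead.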

                                                                                                                    Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and
re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually
one-time programmable in production because of plastic packaging.
Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory
technology. Can be erased, even in plastic packages. Some, but not all,
EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic
packages. Some, but not all, flash devices can be in-system programmed.
Usually, a flash cell is smaller than an equivalent EEPROM cell and is
therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                    Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry
rivals. Together, they control over 80 percent of the market, with Xilinx alone
representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera
provides free Windows tools; the Solaris and Linux tools are only available via
a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel
(antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power),
Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld
focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC
prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for
hardware verification and early software development.

Verifying the hardware design alongside early software and firmware
co-design has become mainstream. Prototyping SoC and ASIC
designs on an FPGA has become a good method to do this.

                                                                                                                    Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is
functionally correct. The alternative is for designers to rely only on software
simulations to verify that their hardware design is sound, where simulation speed and
modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from
the foundry to start on software tests. FPGA prototyping allows much more
time for software development and testing at the software-hardware
integration stage. This helps expose many unforeseen software bugs that appear due
to today's array of operating systems, applications, and hardware.

3. Prototyping also allows the developer to ensure that all IP technologies on the
system work well together outside the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients,
bringing in interest early. This speeds up the overall development cycle and
allows for more enhancement or improvement to the chip features than would
otherwise have been possible.

                                                                                                                    13 SIMULATION RESULTS

Fig. VGA controller (gate level)

Fig. Sync unit (gate level)

Fig. Technology schematic

                                                                                                                    14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in order
to correctly display the picture on a connected monitor.

                                                                                                                    REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. "A VGA display controller" by Eduardo Sanchez

3. Hinner, "VGA Interface and video signal documents"

                                                                                                                    71

                                                                                                                    • Pixel
                                                                                                                      • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                      • Gates
                                                                                                                      • Market size
                                                                                                                      • 122 FPGA Comparisons

                                                                                                                      just enough tracks so that most designs that will fit in terms of LUTs and IOs

                                                                                                                      can be routed This is determined by estimates such as those derived from

                                                                                                                      Rents rule or by experiments with existing designs

                                                                                                                      The FPGA is an array or island-style FPGA It consists of an array of logic

                                                                                                                      blocks and routing channels Two IO pads fit into the height of one row or the

                                                                                                                      width of one column as shown below All the routing channels have the same

                                                                                                                      width (number of wires)

                                                                                                                      121 FPGA structure

                                                                                                                      A classic FPGA logic block consists of a 4-input lookup table (LUT) and a

                                                                                                                      flip-flop as shown below In recent years manufacturers have started moving

                                                                                                                      to 6-input LUTs in their high performance parts claiming increased

                                                                                                                      performance

                                                                                                                      Typical logic block

There is only one output, which can be either the registered or the unregistered LUT output. The logic block has four inputs for the LUT and a clock input. Since clock signals (and often other high-fanout signals) are normally routed via special-purpose dedicated routing networks in commercial FPGAs, they and other signals are separately managed.
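
The behaviour of such a logic block can be sketched in a few lines of Python. This is only a behavioural model under stated assumptions: the 16-bit truth-table encoding, the class name LogicBlock and the power-up value of the flip-flop are choices made for this illustration, not details of any particular device.

```python
class LogicBlock:
    """Behavioural model of one logic block: a 4-input LUT followed by a flip-flop."""

    def __init__(self, truth_table: int, registered: bool = False):
        # Assumed encoding: bit i of the 16-bit word is the LUT output
        # for input pattern i (a is the least significant input bit).
        if not 0 <= truth_table < (1 << 16):
            raise ValueError("a 4-input LUT needs a 16-bit truth table")
        self.truth_table = truth_table
        self.registered = registered  # selects registered or unregistered output
        self.q = 0  # flip-flop state, assumed to start at 0

    def lut(self, a: int, b: int, c: int, d: int) -> int:
        """Combinational LUT output for the four inputs."""
        index = a | (b << 1) | (c << 2) | (d << 3)
        return (self.truth_table >> index) & 1

    def clock_edge(self, a: int, b: int, c: int, d: int) -> None:
        """Model one active clock edge: the flip-flop captures the LUT output."""
        self.q = self.lut(a, b, c, d)

    def output(self, a: int, b: int, c: int, d: int) -> int:
        """The single block output: flip-flop state or raw LUT output."""
        return self.q if self.registered else self.lut(a, b, c, d)


AND4 = 0x8000  # only input pattern 15 (all inputs high) produces 1

comb = LogicBlock(AND4)                   # unregistered output
reg = LogicBlock(AND4, registered=True)   # registered output
```

With the unregistered block the output follows the inputs immediately, while the registered block changes only after clock_edge is called.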

For this example architecture, the locations of the FPGA logic block pins are shown below.

                                                                                                                      Logic Block Pin Locations

Each input is accessible from one side of the logic block, while the output pin can connect to routing wires in both the channel to the right and the channel below the logic block.

Each logic block output pin can connect to any of the wiring segments in the channels adjacent to it.

Similarly, an I/O pad can connect to any one of the wiring segments in the channel adjacent to it. For example, an I/O pad at the top of the chip can connect to any of the W wires (where W is the channel width) in the horizontal channel immediately below it.

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher-speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number 1 connects only to wires in track number 1 in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                                                      Switch box topology
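
The planar rule described above can also be written down as a small Python sketch. The side names, the 1-based track numbering and the function names are assumptions made for this illustration only.

```python
SIDES = ("left", "top", "right", "bottom")


def planar_connections(side: str, track: int, channel_width: int):
    """List the wire ends reachable from one wire entering a planar switch box.

    A wire on a given track can reach the same track number on each of the
    three other sides, which corresponds to three programmable switches.
    """
    if side not in SIDES:
        raise ValueError(f"unknown side: {side}")
    if not 1 <= track <= channel_width:
        raise ValueError("track number outside the channel width")
    return [(other, track) for other in SIDES if other != side]


def can_connect(wire_a, wire_b) -> bool:
    """True if two (side, track) wire ends can be joined inside the switch box."""
    (side_a, track_a), (side_b, track_b) = wire_a, wire_b
    return side_a != side_b and track_a == track_b


# Hypothetical channel width of 4 tracks.
reachable = planar_connections("left", 2, channel_width=4)
```

Because a signal never changes track number in this topology, each track forms its own independent routing domain, which is where the name domain-based comes from.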

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic, and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation, and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although, in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm the synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
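
The idea of repeating the same test bench against the behavioural description and against the mapped netlist can be sketched in Python. This is a toy stand-in for an HDL simulator, not the vendor flow: the full-adder example, the LUT truth-table values and all names below are assumptions chosen for illustration, and no propagation delays are modelled.

```python
from itertools import product


def full_adder_rtl(a: int, b: int, cin: int):
    """Behavioural reference, playing the role of the RTL description."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry)


# Hypothetical result of technology mapping: one LUT truth table per output.
# Bit i of each word is the output for input pattern i = a | b<<1 | cin<<2.
NETLIST = {"sum": 0x96, "carry": 0xE8}


def lut_eval(truth_table: int, inputs) -> int:
    """Evaluate a LUT given its truth table and a tuple of input bits."""
    index = sum(bit << position for position, bit in enumerate(inputs))
    return (truth_table >> index) & 1


def full_adder_netlist(a: int, b: int, cin: int, netlist=NETLIST):
    """Gate-level view: evaluate the mapped LUTs (fourth LUT input tied to 0)."""
    inputs = (a, b, cin, 0)
    return lut_eval(netlist["sum"], inputs), lut_eval(netlist["carry"], inputs)


def run_testbench(netlist=NETLIST):
    """Apply every input pattern to both models and collect any mismatches."""
    mismatches = []
    for a, b, cin in product((0, 1), repeat=3):
        expected = full_adder_rtl(a, b, cin)
        actual = full_adder_netlist(a, b, cin, netlist)
        if expected != actual:
            mismatches.append(((a, b, cin), expected, actual))
    return mismatches
```

An empty list from run_testbench() means the mapped netlist agrees with the behavioural model for every input pattern, which is the kind of check the post-synthesis simulation step provides.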

                                                                                                                      Basic Process Technology Types

SRAM - based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                      Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

                                                                                                                      Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulations to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together outside the simulation stage, in actual hardware.

4. Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                                                      13 SIMULATION RESULTS

GATE LEVEL

Fig. VGA CONTROLLER

GATE LEVEL

Fig. SYNC UNIT

TECHNOLOGY SCHEMATIC

                                                                                                                      14 APPLICATIONS

                                                                                                                      Motion Tracking Systems

                                                                                                                      Image processing systems

                                                                                                                      Displaying Systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
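
The relationship between the pixel position, the RGB output and the two synchronization signals can be sketched in Python. This is a behavioural illustration only, not the HDL used in this project: the timing constants are the commonly quoted values for 640x480 at about 60 Hz with active-low sync pulses, and the function names and the frame buffer layout are assumptions.

```python
# Commonly quoted 640x480 timing (assumed here), in pixels and lines.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # lines per frame
PIXEL_CLOCK_HZ = 25_175_000


def controller_outputs(framebuffer, x: int, y: int):
    """Return (rgb, hsync, vsync) for horizontal counter x and vertical counter y.

    framebuffer is assumed to be a list of rows, each a list of (r, g, b) tuples.
    """
    visible = x < H_VISIBLE and y < V_VISIBLE
    # Outside the visible area the colour outputs are blanked.
    rgb = framebuffer[y][x] if visible else (0, 0, 0)
    # Sync pulses are active low: 0 during the pulse, 1 otherwise.
    hsync = 0 if H_VISIBLE + H_FRONT <= x < H_VISIBLE + H_FRONT + H_SYNC else 1
    vsync = 0 if V_VISIBLE + V_FRONT <= y < V_VISIBLE + V_FRONT + V_SYNC else 1
    return rgb, hsync, vsync


def refresh_rate_hz() -> float:
    """Frames per second implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)
```

In hardware the two counters would be incremented by the pixel clock, with x wrapping at H_TOTAL and y advancing once per line and wrapping at V_TOTAL.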

                                                                                                                      • Pixel
                                                                                                                        • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                        • Gates
                                                                                                                        • Market size
                                                                                                                        • 122 FPGA Comparisons

                                                                                                                        There is only one output which can be either the registered or the unregistered

                                                                                                                        LUT output The logic block has four inputs for the LUT and a clock input

                                                                                                                        Since clock signals (and often other high-fanout signals) are normally routed

                                                                                                                        via special-purpose dedicated routing networks in commercial FPGAs they

                                                                                                                        and other signals are separately managed

                                                                                                                        For this example architecture the locations of the FPGA logic block pins are

                                                                                                                        shown below

                                                                                                                        Logic Block Pin Locations

                                                                                                                        Each input is accessible from one side of the logic block while the output pin

                                                                                                                        can connect to routing wires in both the channel to the right and the channel

                                                                                                                        below the logic block

                                                                                                                        Each logic block output pin can connect to any of the wiring segments in the

                                                                                                                        channels adjacent to it

                                                                                                                        Similarly an IO pad can connect to any one of the wiring segments in the

                                                                                                                        channel adjacent to it For example an IO pad at the top of the chip can

                                                                                                                        connect to any of the W wires (where W is the channel width) in the horizontal

                                                                                                                        channel immediately below it

Generally, the FPGA routing is unsegmented. That is, each wiring segment spans only one logic block before it terminates in a switch box. By turning on some of the programmable switches within a switch box, longer paths can be constructed. For higher speed interconnect, some FPGA architectures use longer routing lines that span multiple logic blocks.

Whenever a vertical and a horizontal channel intersect, there is a switch box. In this architecture, when a wire enters a switch box, there are three programmable switches that allow it to connect to three other wires in adjacent channel segments. The pattern, or topology, of switches used in this architecture is the planar or domain-based switch box topology. In this switch box topology, a wire in track number one connects only to wires in track number one in adjacent channel segments, wires in track number 2 connect only to other wires in track number 2, and so on. The figure below illustrates the connections in a switch box.

                                                                                                                        Switch box topology
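The planar topology described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the side names, the function names planar_switch_box and neighbours, and the representation of a switch as a pair of (side, track) endpoints are hypothetical and are not taken from any vendor tool. The sketch enumerates the programmable switches of one switch box and shows that each incoming wire reaches exactly three other wires, all on the same track number.

```python
from itertools import combinations

# Hypothetical names for the four channel segments meeting at a switch box.
SIDES = ("left", "top", "right", "bottom")


def planar_switch_box(channel_width):
    """Return the set of programmable switches in a planar switch box.

    Each switch is a frozenset of two (side, track) endpoints. In the
    planar (domain-based) topology a switch only joins wires that share
    the same track number.
    """
    switches = set()
    for track in range(channel_width):
        for side_a, side_b in combinations(SIDES, 2):
            switches.add(frozenset({(side_a, track), (side_b, track)}))
    return switches


def neighbours(switches, wire):
    """List the wires reachable from `wire` through a single switch."""
    return sorted(
        other
        for switch in switches
        if wire in switch
        for other in switch
        if other != wire
    )


if __name__ == "__main__":
    box = planar_switch_box(channel_width=4)
    print(len(box))                      # 6 switches per track, 4 tracks
    print(neighbours(box, ("left", 1)))  # three wires, all on track 1
```

With a channel width of W, this model contains 6 x W switches, since each track contributes one switch for every pair of sides.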

Modern FPGA families expand upon the above capabilities to include higher-level functionality fixed into the silicon. Having these common functions embedded into the silicon reduces the area required and gives those functions increased speed compared to building them from primitives. Examples of these include multipliers, generic DSP blocks, embedded processors, high-speed I/O logic and embedded memories.

FPGAs are also widely used for systems validation, including pre-silicon validation, post-silicon validation and firmware development. This allows chip companies to validate their design before the chip is produced in the factory, reducing the time to market.

12.4 FPGA Design and Programming

To define the behavior of the FPGA, the user provides a hardware description language (HDL) or a schematic design. The HDL form might be easier to work with when handling large structures, because it's possible to just specify them numerically rather than having to draw every piece by hand. On the other hand, schematic entry can allow for easier visualisation of a design.

Then, using an electronic design automation tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user will validate the map, place and route results via timing analysis, simulation and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through different steps, will produce a file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG) or to an external memory device like an EEPROM.

The most common HDLs are VHDL and Verilog, although in an attempt to reduce the complexity of designing in HDLs, which have been compared to the equivalent of assembly languages, there are moves to raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD or similar license) and other sources.

In a typical design flow, an FPGA application developer will simulate the design at multiple stages throughout the design process. Initially the RTL description in VHDL or Verilog is simulated by creating test benches to simulate the system and observe results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate level description where simulation is repeated to confirm the synthesis proceeded without errors. Finally the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
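The idea of repeating the same test bench against the gate level description can be illustrated with a small Python sketch. It is not HDL and not a real simulator: the full adder, the function names rtl_full_adder, gate_full_adder and run_testbench, and the exhaustive stimulus are all assumptions chosen for illustration. The behavioural function stands in for the RTL description, the gate-by-gate function stands in for the synthesized netlist, and the test bench drives both with the same input vectors and reports any mismatch.

```python
from itertools import product


def rtl_full_adder(a, b, cin):
    """Behavioural (RTL-style) reference: add three 1-bit inputs."""
    total = a + b + cin
    return total & 1, total >> 1  # (sum, carry-out)


def gate_full_adder(a, b, cin):
    """Gate-level stand-in for a synthesized netlist (XOR/AND/OR gates)."""
    a_xor_b = a ^ b
    sum_bit = a_xor_b ^ cin
    carry_out = (a & b) | (a_xor_b & cin)
    return sum_bit, carry_out


def run_testbench(reference, candidate):
    """Apply every input vector to both models and collect mismatches."""
    mismatches = []
    for vector in product((0, 1), repeat=3):
        if reference(*vector) != candidate(*vector):
            mismatches.append(vector)
    return mismatches


if __name__ == "__main__":
    failures = run_testbench(rtl_full_adder, gate_full_adder)
    print("mismatching vectors:", failures)
```

An empty mismatch list plays the role of confirming that synthesis proceeded without errors. A timing simulation with back-annotated delays would additionally need a model of propagation delay, which this sketch leaves out.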

                                                                                                                        Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased, even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased, even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                        Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; the Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed) and QuickLogic (handheld focused CSSP, no general purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verification of hardware designs together with early software and firmware co-design has become mainstream. Prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

                                                                                                                        Reasons why Prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. This is compared to designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder this development.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows for much more time in the area of software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications and hardware, to be found.

3. Prototyping also allows the developer to ensure that all IP technologies on the system work well together, off the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement to the chip features than would otherwise have been possible.

                                                                                                                        13 SIMULATION RESULTS

                                                                                                                        GATE LEVEL

Fig. VGA CONTROLLER

                                                                                                                        GATE LEVEL

Fig. SYNC UNIT

                                                                                                                        TECHNOLOGY SCHEMATIC

                                                                                                                        14 APPLICATIONS

                                                                                                                        Motion Tracking Systems

                                                                                                                        Image processing systems

                                                                                                                        Displaying Systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
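As a rough illustration of how the synchronization signals relate to pixel position, the following Python sketch models one frame of timing. It is a behavioural model only, not the HDL of this project. The timing constants are the commonly quoted values for 640x480 at 60 Hz and are an assumption here, as are the names sync_signals, H_TOTAL and V_TOTAL and the choice of active-low sync pulses.

```python
# Assumed 640x480 @ 60 Hz timing, in pixel clocks (horizontal) and lines (vertical).
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33

H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK  # pixel clocks per line
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK  # lines per frame


def sync_signals(x, y):
    """Return (hsync, vsync, visible) for counter position (x, y).

    hsync and vsync are modelled as active-low: False while the pulse
    is asserted. visible is True only inside the 640x480 picture area,
    where RGB values would be driven from memory.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    hsync = not (h_start <= x < h_start + H_SYNC)
    vsync = not (v_start <= y < v_start + V_SYNC)
    visible = x < H_VISIBLE and y < V_VISIBLE
    return hsync, vsync, visible


if __name__ == "__main__":
    print(H_TOTAL, V_TOTAL)
    print(sync_signals(0, 0))    # inside the picture area
    print(sync_signals(656, 0))  # first clock of the horizontal sync pulse
```

In hardware, x and y would be two counters driven by the pixel clock, with x wrapping at H_TOTAL and incrementing y, and y wrapping at V_TOTAL.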

                                                                                                                        • Pixel
                                                                                                                          • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                          • Gates
                                                                                                                          • Market size
                                                                                                                          • 122 FPGA Comparisons

                                                                                                                          some of the programmable switches within a switch box longer paths can be

                                                                                                                          constructed For higher speed interconnect some FPGA architectures use

                                                                                                                          longer routing lines that span multiple logic blocks

                                                                                                                          Whenever a vertical and a horizontal channel intersect there is a switch box In

                                                                                                                          this architecture when a wire enters a switch box there are three

                                                                                                                          programmable switches that allow it to connect to three other wires in adjacent

                                                                                                                          channel segments The pattern or topology of switches used in this

                                                                                                                          architecture is the planar or domain-based switch box topology In this switch

                                                                                                                          box topology a wire in track number one connects only to wires in track

                                                                                                                          number one in adjacent channel segments wires in track number 2 connect

                                                                                                                          only to other wires in track number 2 and so on The figure below illustrates

                                                                                                                          the connections in a switch box

                                                                                                                          Switch box topology

                                                                                                                          Modern FPGA families expand upon the above capabilities to include higher

                                                                                                                          level functionality fixed into the silicon Having these common functions

                                                                                                                          embedded into the silicon reduces the area required and gives those functions

                                                                                                                          increased speed compared to building them from primitives Examples of these

                                                                                                                          include multipliers generic DSP blocks embedded processors high speed IO

                                                                                                                          logic and embedded memories

                                                                                                                          61

                                                                                                                          FPGAs are also widely used for systems validation including pre-silicon

                                                                                                                          validation post-silicon validation and firmware development This allows chip

                                                                                                                          companies to validate their design before the chip is produced in the factory

                                                                                                                          reducing the time to market

                                                                                                                          124 FPGA Design and Programming

                                                                                                                          To define the behavior of the FPGA the user provides a hardware description

                                                                                                                          language (HDL) or a schematic design The HDL form might be easier to work

                                                                                                                          with when handling large structures because its possible to just specify them

                                                                                                                          numerically rather than having to draw every piece by hand On the other hand

                                                                                                                          schematic entry can allow for easier visualisation of a design

Then, using an electronic design automation (EDA) tool, a technology-mapped netlist is generated. The netlist can then be fitted to the actual FPGA architecture using a process called place-and-route, usually performed by the FPGA company's proprietary place-and-route software. The user validates the map, place, and route results via timing analysis, simulation, and other verification methodologies. Once the design and validation process is complete, the binary file generated (also using the FPGA company's proprietary software) is used to (re)configure the FPGA.

Going from schematic/HDL source files to the actual configuration: the source files are fed to a software suite from the FPGA/CPLD vendor that, through several steps, produces a configuration file. This file is then transferred to the FPGA/CPLD via a serial interface (JTAG), or to an external memory device such as an EEPROM.

The most common HDLs are VHDL and Verilog. Because designing in these HDLs has been compared to programming in assembly language, there are moves to reduce the complexity and raise the abstraction level through the introduction of alternative languages.

To simplify the design of complex systems in FPGAs, there exist libraries of predefined complex functions and circuits that have been tested and optimized to speed up the design process. These predefined circuits are commonly called IP cores, and are available from FPGA vendors and third-party IP suppliers (rarely free, and typically released under proprietary licenses). Other predefined circuits are available from developer communities such as OpenCores (typically free, and released under the GPL, BSD, or a similar license) and other sources.

In a typical design flow, an FPGA application developer simulates the design at multiple stages throughout the design process. Initially, the RTL description in VHDL or Verilog is simulated by creating test benches that exercise the system and observe the results. Then, after the synthesis engine has mapped the design to a netlist, the netlist is translated to a gate-level description, where simulation is repeated to confirm that synthesis proceeded without errors. Finally, the design is laid out in the FPGA, at which point propagation delays can be added and the simulation run again with these values back-annotated onto the netlist.
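The test-bench idea can be illustrated outside an HDL simulator. The sketch below is only an analogy written in Python, not VHDL or Verilog: the `ModCounter` class, its modulus, and the helper names are all hypothetical and do not come from the VGA controller design. It drives a small behavioural model one clock cycle at a time, records the outputs, and compares them with the expected sequence, which is roughly what an RTL test bench does.

```python
class ModCounter:
    """Behavioural model of a hypothetical modulo-N counter (illustrative only)."""

    def __init__(self, modulus):
        self.modulus = modulus
        self.count = 0

    def tick(self, reset=False):
        """Advance one clock cycle; a synchronous reset takes priority."""
        if reset:
            self.count = 0
        else:
            self.count = (self.count + 1) % self.modulus
        return self.count


def run_test_bench(modulus=5, cycles=12):
    """Drive the model for `cycles` clock edges and record the output trace."""
    dut = ModCounter(modulus)          # "device under test"
    trace = [dut.tick(reset=True)]     # cycle 0: apply reset
    for _ in range(cycles - 1):
        trace.append(dut.tick())
    return trace


def check_trace(trace, modulus):
    """Compare the observed trace with the expected wrap-around sequence."""
    expected = [i % modulus for i in range(len(trace))]
    return trace == expected


if __name__ == "__main__":
    observed = run_test_bench()
    print(observed)
    print("PASS" if check_trace(observed, 5) else "FAIL")
```

In a real flow the same stimulus and checks would be re-run against the gate-level netlist and again after place-and-route with back-annotated delays, so that one test bench covers all three stages.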

                                                                                                                          Basic Process Technology Types

SRAM - Based on static memory technology. In-system programmable and re-programmable. Requires external boot devices. CMOS.

Antifuse - One-time programmable. CMOS.

EPROM - Erasable Programmable Read-Only Memory technology. Usually one-time programmable in production because of plastic packaging. Windowed devices can be erased with ultraviolet (UV) light. CMOS.

EEPROM - Electrically Erasable Programmable Read-Only Memory technology. Can be erased even in plastic packages. Some, but not all, EEPROM devices can be in-system programmed. CMOS.

Flash - Flash-erase EPROM technology. Can be erased even in plastic packages. Some, but not all, flash devices can be in-system programmed. Usually, a flash cell is smaller than an equivalent EEPROM cell and is therefore less expensive to manufacture. CMOS.

Fuse - One-time programmable. Bipolar.

                                                                                                                          Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together, they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx also provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping SoC and ASIC designs on an FPGA for hardware verification and early software development.

Verifying the hardware design alongside early software and firmware co-design has become mainstream, and prototyping SoC and ASIC designs on an FPGA has become a good method for doing this.

Reasons why prototyping is important

1. Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct, compared with designers relying only on software simulations to verify that their hardware design is sound. Simulation speed and modeling accuracy limitations hinder the simulation-only approach.

2. Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start software tests. FPGA prototyping allows much more time for software development and testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear because of today's array of operating systems, applications, and hardware, to be uncovered early.

3. Prototyping also allows the developer to ensure that all IP technologies in the system work well together, beyond the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

                                                                                                                          13 SIMULATION RESULTS

Fig. VGA controller (gate level)

Fig. Sync unit (gate level)

Fig. Technology schematic

                                                                                                                          14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, with the VGA controller implemented on an FPGA, a processor-generated (pixelated) picture is taken from memory space, and the controller provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
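How the synchronization signals follow from a pair of pixel and line counters can be sketched in a few lines. The Python model below is a behavioural illustration only, not the HDL used in this project. It assumes the widely used 640x480 at 60 Hz timing (25.175 MHz pixel clock, 800 pixel periods per line, 525 lines per frame, both sync pulses active-low); the function and constant names are hypothetical.

```python
# Assumed standard 640x480 @ 60 Hz timing; names below are illustrative.
H_VISIBLE, H_FRONT, H_SYNC, H_BACK = 640, 16, 96, 48   # pixel periods per line
V_VISIBLE, V_FRONT, V_SYNC, V_BACK = 480, 10, 2, 33    # lines per frame
H_TOTAL = H_VISIBLE + H_FRONT + H_SYNC + H_BACK        # 800
V_TOTAL = V_VISIBLE + V_FRONT + V_SYNC + V_BACK        # 525
PIXEL_CLOCK_HZ = 25_175_000


def sync_signals(h_count, v_count):
    """Return (hsync, vsync, visible) for the given counter values.

    Both sync outputs are active-low here: False means the pulse is asserted.
    """
    h_start = H_VISIBLE + H_FRONT
    v_start = V_VISIBLE + V_FRONT
    in_hsync = h_start <= h_count < h_start + H_SYNC
    in_vsync = v_start <= v_count < v_start + V_SYNC
    visible = h_count < H_VISIBLE and v_count < V_VISIBLE
    return (not in_hsync, not in_vsync, visible)


def next_counters(h_count, v_count):
    """Advance the counters by one pixel clock, wrapping at line and frame end."""
    h_count += 1
    if h_count == H_TOTAL:
        h_count = 0
        v_count = (v_count + 1) % V_TOTAL
    return h_count, v_count


def refresh_rate_hz():
    """Frame rate implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H_TOTAL * V_TOTAL)


if __name__ == "__main__":
    print(sync_signals(0, 0))      # visible area, both syncs idle (high)
    print(sync_signals(656, 0))    # first pixel period of the hsync pulse
    print(round(refresh_rate_hz(), 2))
```

RGB values would be driven from memory only while `visible` is true and held at black during the blanking intervals; in hardware the two counters are simple registers clocked by the pixel clock.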

                                                                                                                          • Pixel
                                                                                                                            • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                            • Gates
                                                                                                                            • Market size
                                                                                                                            • 122 FPGA Comparisons

                                                                                                                            FPGAs are also widely used for systems validation including pre-silicon

                                                                                                                            validation post-silicon validation and firmware development This allows chip

                                                                                                                            companies to validate their design before the chip is produced in the factory

                                                                                                                            reducing the time to market

                                                                                                                            124 FPGA Design and Programming

                                                                                                                            To define the behavior of the FPGA the user provides a hardware description

                                                                                                                            language (HDL) or a schematic design The HDL form might be easier to work

                                                                                                                            with when handling large structures because its possible to just specify them

                                                                                                                            numerically rather than having to draw every piece by hand On the other hand

                                                                                                                            schematic entry can allow for easier visualisation of a design

                                                                                                                            Then using an electronic design automation tool a technology-mapped netlist

                                                                                                                            is generated The netlist can then be fitted to the actual FPGA architecture

                                                                                                                            using a process called place-and-route usually performed by the FPGA

                                                                                                                            companys proprietary place-and-route software The user will validate the

                                                                                                                            map place and route results via timing analysis simulation and other

                                                                                                                            verification methodologies Once the design and validation process is

                                                                                                                            complete the binary file generated (also using the FPGA companys

                                                                                                                            proprietary software) is used to (re)configure the FPGA

                                                                                                                            Going from schematicHDL source files to actual configuration The source

                                                                                                                            files are fed to a software suite from the FPGACPLD vendor that through

                                                                                                                            different steps will produce a file This file is then transferred to the

                                                                                                                            FPGACPLD via a serial interface (JTAG) or to an external memory device

                                                                                                                            like an EEPROM

                                                                                                                            The most common HDLs are VHDL and Verilog although in an attempt to

                                                                                                                            reduce the complexity of designing in HDLs which have been compared to the

                                                                                                                            equivalent of assembly languages there are moves to raise the abstraction level

                                                                                                                            through the introduction of alternative languages

                                                                                                                            62

                                                                                                                            To simplify the design of complex systems in FPGAs there exist libraries of

                                                                                                                            predefined complex functions and circuits that have been tested and optimized

                                                                                                                            to speed up the design process These predefined circuits are commonly called

                                                                                                                            IP cores and are available from FPGA vendors and third-party IP suppliers

                                                                                                                            (rarely free and typically released under proprietary licenses) Other

                                                                                                                            predefined circuits are available from developer communities such as Open

                                                                                                                            Cores (typically free and released under the GPL BSD or similar license) and

                                                                                                                            other sources

                                                                                                                            In a typical design flow an FPGA application developer will simulate the

                                                                                                                            design at multiple stages throughout the design process Initially the RTL

                                                                                                                            description in VHDL or Verilog is simulated by creating test benches to

                                                                                                                            simulate the system and observe results Then after the synthesis engine has

                                                                                                                            mapped the design to a netlist the netlist is translated to a gate level

                                                                                                                            description where simulation is repeated to confirm the synthesis proceeded

                                                                                                                            without errors Finally the design is laid out in the FPGA at which point

                                                                                                                            propagation delays can be added and the simulation run again with these

                                                                                                                            values back-annotated onto the netlist

                                                                                                                            Basic Process Technology Types

                                                                                                                            SRAM - based on static memory technology In-system programmable and re-

                                                                                                                            programmable Requires external boot devices CMOS

                                                                                                                            Antifuse - One-time programmable CMOS

                                                                                                                            EPROM - Erasable Programmable Read-Only Memory technology Usually

                                                                                                                            one-time programmable in production because of plastic packaging

                                                                                                                            Windowed devices can be erased with ultraviolet (UV) light CMOS

                                                                                                                            EEPROM - Electrically Erasable Programmable Read-Only Memory

                                                                                                                            technology Can be erased even in plastic packages Some but not all

                                                                                                                            EEPROM devices can be in-system programmed CMOS

                                                                                                                            Flash - Flash-erase EPROM technology Can be erased even in plastic

                                                                                                                            packages Some but not all flash devices can be in-system programmed

                                                                                                                            63

                                                                                                                            Usually a flash cell is smaller than an equivalent EEPROM cell and is

                                                                                                                            therefore less expensive to manufacture CMOS

                                                                                                                            Fuse - One-time programmable Bipolar

Major Manufacturers

Xilinx and Altera are the current FPGA market leaders and long-time industry rivals. Together they control over 80 percent of the market, with Xilinx alone representing over 50 percent.

Xilinx provides free Windows and Linux design software, while Altera provides free Windows tools; its Solaris and Linux tools are only available via a rental scheme.

Other competitors include Lattice Semiconductor (flash, SRAM), Actel (antifuse, flash-based, mixed-signal), SiliconBlue Technologies (low power), Achronix (RAM based, 1.5 GHz fabric speed), and QuickLogic (handheld-focused CSSP, no general-purpose FPGAs).

12.5 FPGA prototype

FPGA prototyping, sometimes also referred to as ASIC prototyping or SoC prototyping, is the method of prototyping an SoC or ASIC design on an FPGA for hardware verification and early software development.

Verifying a hardware design alongside early software and firmware co-design has become mainstream practice, and prototyping SoC and ASIC designs on an FPGA has become a good method to do this.

Reasons why Prototyping is important

1 Running an SoC design on an FPGA prototype is a reliable way to ensure that it is functionally correct. The alternative is for designers to rely only on software simulation to verify that their hardware design is sound, where simulation speed and modeling accuracy limitations hinder development.

2 Due to time constraints, many projects cannot wait until the silicon is back from the foundry to start on software tests. FPGA prototyping allows much more time for software development and for testing at the software-hardware integration stage. This allows many unforeseen software bugs, which appear due to today's array of operating systems, applications, and hardware, to be found early.

3 Prototyping also allows the developer to ensure that all IP technologies on the system work well together outside the simulation stage and in actual form.

4 Prototypes have the added advantage of serving as demo platforms for SoC clients, bringing in interest early. This speeds up the overall development cycle and allows for more enhancement or improvement of the chip features than would otherwise have been possible.

13 SIMULATION RESULTS

Fig: VGA controller (gate level)

Fig: Sync unit (gate level)

Fig: Technology schematic

                                                                                                                            14 APPLICATIONS

Motion tracking systems

Image processing systems

Display systems

14.1 CONCLUSION

Hence, when implemented on an FPGA, the Controller takes a processor-generated (pixelated) picture from memory space and provides digital RGB values for each pixel, as well as horizontal and vertical synchronization signals, in order to correctly display the picture on a connected monitor.
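The behaviour summarized above can be illustrated with a small software model. The sketch below is not the project's HDL; it is a minimal Python model that assumes the standard 640x480 at 60 Hz timing (25.175 MHz pixel clock, 800 clocks per line, 525 lines per frame, active-low sync pulses). The names AxisTiming, vga_signals, step, and the framebuffer callback are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisTiming:
    """Timing of one axis, in pixel clocks (horizontal) or lines (vertical)."""
    visible: int
    front_porch: int
    sync: int
    back_porch: int

    @property
    def total(self) -> int:
        return self.visible + self.front_porch + self.sync + self.back_porch

    def sync_active(self, count: int) -> bool:
        # The sync pulse follows the visible area and the front porch.
        start = self.visible + self.front_porch
        return start <= count < start + self.sync


# Assumed standard 640x480 @ 60 Hz timing.
H = AxisTiming(visible=640, front_porch=16, sync=96, back_porch=48)
V = AxisTiming(visible=480, front_porch=10, sync=2, back_porch=33)
PIXEL_CLOCK_HZ = 25_175_000


def vga_signals(x, y, framebuffer):
    """Return (hsync, vsync, rgb) for counter position (x, y).

    framebuffer is a hypothetical callback (x, y) -> (r, g, b) standing in
    for the picture held in memory space. Sync outputs are active low.
    """
    hsync = 0 if H.sync_active(x) else 1
    vsync = 0 if V.sync_active(y) else 1
    if x < H.visible and y < V.visible:
        rgb = framebuffer(x, y)
    else:
        rgb = (0, 0, 0)  # blank the outputs outside the visible area
    return hsync, vsync, rgb


def step(x, y):
    """Advance the horizontal and vertical counters by one pixel clock."""
    x += 1
    if x == H.total:
        x = 0
        y = (y + 1) % V.total
    return x, y


def refresh_rate_hz():
    """Frame rate implied by the pixel clock and the total frame size."""
    return PIXEL_CLOCK_HZ / (H.total * V.total)
```

In hardware the two counters would be registers clocked by the pixel clock, and the comparisons in sync_active would become simple comparators; the model only makes the relationship between the counters, the sync pulses, and the RGB outputs explicit.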





                                                                                                                                Usually a flash cell is smaller than an equivalent EEPROM cell and is

                                                                                                                                therefore less expensive to manufacture CMOS

                                                                                                                                Fuse - One-time programmable Bipolar

                                                                                                                                Major Manufacturers

                                                                                                                                Xilinx and Altera are the current FPGA market leaders and long-time industry

                                                                                                                                rivals Together they control over 80 percent of the market with Xilinx alone

                                                                                                                                representing over 50 percent

                                                                                                                                Xilinx also provides free Windows and Linux design software while Altera

                                                                                                                                provides free Windows tools the Solaris and Linux tools are only available via

                                                                                                                                a rental scheme

                                                                                                                                Other competitors include Lattice Semiconductor (flash SRAM) Actel

                                                                                                                                (antifuse flash-based mixed-signal) SiliconBlue Technologies (low power)

                                                                                                                                Achronix (RAM based 15GHz fabric speed) and QuickLogic (handheld

                                                                                                                                focused CSSP no general purpose FPGAs)

                                                                                                                                125 FPGA prototype

                                                                                                                                FPGA prototyping sometimes also referred to as ASIC prototyping or SoC

                                                                                                                                prototyping is the method to prototype SoC and ASIC design on FPGA for

                                                                                                                                hardware verification and early software development

                                                                                                                                Main stream verification methods for hardware design and early software and

                                                                                                                                firmware co-design has become mainstream Prototyping SoC and ASIC

                                                                                                                                design on FPGA has become a good method to do this

                                                                                                                                64

Reasons why prototyping is important

1. Running a SoC design on an FPGA prototype is a reliable way to ensure that
it is functionally correct, compared with relying only on software simulation
to verify that the hardware design is sound. Limits on simulation speed and
modeling accuracy hinder simulation-only verification.

2. Due to time constraints, many projects cannot wait until the silicon is back
from the foundry to start software tests. FPGA prototyping allows much more
time for software development and for testing at the software-hardware
integration stage. This allows many unforeseen software bugs, which appear
because of today's array of operating systems, applications and hardware, to
be found early.

3. Prototyping also allows the developer to ensure that all IP technologies in
the system work well together, beyond the simulation stage and in actual form.

4. Prototyping has the added advantage of serving as a demo platform for SoC
clients, bringing in interest early. This speeds up the overall development
cycle and allows for more enhancement or improvement of the chip features
than would otherwise have been possible.


                                                                                                                                13 SIMULATION RESULTS

Fig: VGA controller (gate level)

Fig: Sync unit (gate level)

Fig: Technology schematic

                                                                                                                                14 APPLICATIONS

                                                                                                                                Motion Tracking Systems

                                                                                                                                Image processing systems

Display systems

14.1 CONCLUSION

Hence, implemented on an FPGA, the Controller takes a processor-generated
(pixelated) picture from memory space and provides digital RGB values for
each pixel, as well as horizontal and vertical synchronization signals, in
order to correctly display the picture on a connected monitor.


                                                                                                                                REFERENCES

1. http://en.wikipedia.org/wiki/Video_Graphics_Array

2. Eduardo Sanchez, "A VGA display controller"

3. Hinner, "VGA Interface and video signal documents"

                                                                                                                                71

                                                                                                                                • Pixel
                                                                                                                                  • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                                  • Gates
                                                                                                                                  • Market size
                                                                                                                                  • 122 FPGA Comparisons

                                                                                                                                  Reasons why Prototyping is important

                                                                                                                                  1 Running a SoC design on FPGA prototype is a reliable way to ensure that it is

                                                                                                                                  functionally correct This is compared to designers only relying on software

                                                                                                                                  simulations to verify that their hardware design is sound Simulation speed and

                                                                                                                                  modeling accuracy limitations hinder this development

                                                                                                                                  2 Due to time constrains many projects cannot wait until the silicon is back from

                                                                                                                                  the foundry to start on software tests FPGA prototyping allows for much more

                                                                                                                                  time in area of software development and testing at the software-hardware

                                                                                                                                  integration stage This allows many unforeseen software bugs that appear due

                                                                                                                                  to todays array of operating systems applications and hardware

                                                                                                                                  3 Prototyping also allows the developer to ensure that all IP technologies on his

                                                                                                                                  system work well together off the simulation stage and in actual form

                                                                                                                                  4 Prototyping has the added advantage as demo platforms to SoC clients

                                                                                                                                  bringing in interest early This speeds up the overall development cycle and

                                                                                                                                  allows for more enhancement or improvement to the chip features as it would

                                                                                                                                  otherwise have been

                                                                                                                                  65

                                                                                                                                  13 SIMULATION RESULTS

                                                                                                                                  66

                                                                                                                                  GATE LEVEL

                                                                                                                                  FigVGA CONTROLLER

                                                                                                                                  67

                                                                                                                                  GATE LEVEL

                                                                                                                                  Fig SYNC UNIT

                                                                                                                                  68

                                                                                                                                  TECHNOLOGY SCHEMATIC

                                                                                                                                  69

                                                                                                                                  14 APPLICATIONS

                                                                                                                                  Motion Tracking Systems

                                                                                                                                  Image processing systems

                                                                                                                                  Displaying Systems

                                                                                                                                  141 CONCLUTION

                                                                                                                                  Hence by implementing FPGA taking a processor-generated picture

                                                                                                                                  (pixilated) from memory space the Controller provides digital RGB values for

                                                                                                                                  each pixel as well as horizontal and vertical synchronization signals in order

                                                                                                                                  to correctly display the picture on a connected monitor

                                                                                                                                  70

                                                                                                                                  REFERENCES

                                                                                                                                  1 httpenwikipediaorgwikiVideo_Graphics_Array

                                                                                                                                  2 A VGA display controller by Eduardo Sanchez

                                                                                                                                  3 HinnerrdquoVGA Interface and video signal documentsrdquo

                                                                                                                                  71

                                                                                                                                  • Pixel
                                                                                                                                    • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                                    • Gates
                                                                                                                                    • Market size
                                                                                                                                    • 122 FPGA Comparisons

                                                                                                                                    13 SIMULATION RESULTS

                                                                                                                                    66

                                                                                                                                    GATE LEVEL

                                                                                                                                    FigVGA CONTROLLER

                                                                                                                                    67

                                                                                                                                    GATE LEVEL

                                                                                                                                    Fig SYNC UNIT

                                                                                                                                    68

                                                                                                                                    TECHNOLOGY SCHEMATIC

                                                                                                                                    69

                                                                                                                                    14 APPLICATIONS

                                                                                                                                    Motion Tracking Systems

                                                                                                                                    Image processing systems

                                                                                                                                    Displaying Systems

                                                                                                                                    141 CONCLUTION

                                                                                                                                    Hence by implementing FPGA taking a processor-generated picture

                                                                                                                                    (pixilated) from memory space the Controller provides digital RGB values for

                                                                                                                                    each pixel as well as horizontal and vertical synchronization signals in order

                                                                                                                                    to correctly display the picture on a connected monitor

                                                                                                                                    70

                                                                                                                                    REFERENCES

                                                                                                                                    1 httpenwikipediaorgwikiVideo_Graphics_Array

                                                                                                                                    2 A VGA display controller by Eduardo Sanchez

                                                                                                                                    3 HinnerrdquoVGA Interface and video signal documentsrdquo

                                                                                                                                    71

                                                                                                                                    • Pixel
                                                                                                                                      • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                                      • Gates
                                                                                                                                      • Market size
                                                                                                                                      • 122 FPGA Comparisons

                                                                                                                                      GATE LEVEL

                                                                                                                                      FigVGA CONTROLLER

                                                                                                                                      67

                                                                                                                                      GATE LEVEL

                                                                                                                                      Fig SYNC UNIT

                                                                                                                                      68

                                                                                                                                      TECHNOLOGY SCHEMATIC

                                                                                                                                      69

                                                                                                                                      14 APPLICATIONS

                                                                                                                                      Motion Tracking Systems

                                                                                                                                      Image processing systems

                                                                                                                                      Displaying Systems

                                                                                                                                      141 CONCLUTION

                                                                                                                                      Hence by implementing FPGA taking a processor-generated picture

                                                                                                                                      (pixilated) from memory space the Controller provides digital RGB values for

                                                                                                                                      each pixel as well as horizontal and vertical synchronization signals in order

                                                                                                                                      to correctly display the picture on a connected monitor

                                                                                                                                      70

                                                                                                                                      REFERENCES

                                                                                                                                      1 httpenwikipediaorgwikiVideo_Graphics_Array

                                                                                                                                      2 A VGA display controller by Eduardo Sanchez

                                                                                                                                      3 HinnerrdquoVGA Interface and video signal documentsrdquo

                                                                                                                                      71

                                                                                                                                      • Pixel
                                                                                                                                        • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                                        • Gates
                                                                                                                                        • Market size
                                                                                                                                        • 122 FPGA Comparisons

                                                                                                                                        GATE LEVEL

                                                                                                                                        Fig SYNC UNIT

                                                                                                                                        68

                                                                                                                                        TECHNOLOGY SCHEMATIC

                                                                                                                                        69

                                                                                                                                        14 APPLICATIONS

                                                                                                                                        Motion Tracking Systems

                                                                                                                                        Image processing systems

                                                                                                                                        Displaying Systems

                                                                                                                                        141 CONCLUTION

                                                                                                                                        Hence by implementing FPGA taking a processor-generated picture

                                                                                                                                        (pixilated) from memory space the Controller provides digital RGB values for

                                                                                                                                        each pixel as well as horizontal and vertical synchronization signals in order

                                                                                                                                        to correctly display the picture on a connected monitor

                                                                                                                                        70

                                                                                                                                        REFERENCES

                                                                                                                                        1 httpenwikipediaorgwikiVideo_Graphics_Array

                                                                                                                                        2 A VGA display controller by Eduardo Sanchez

                                                                                                                                        3 HinnerrdquoVGA Interface and video signal documentsrdquo

                                                                                                                                        71

                                                                                                                                        • Pixel
                                                                                                                                          • 6 SIGNALS IN VGA CONTROLLER
                                                                                                                                          • Gates
                                                                                                                                          • Market size
                                                                                                                                          • 122 FPGA Comparisons

                                                                                                                                          TECHNOLOGY SCHEMATIC

                                                                                                                                          69

                                                                                                                                          14 APPLICATIONS

                                                                                                                                          Motion Tracking Systems

                                                                                                                                          Image processing systems

                                                                                                                                          Displaying Systems

                                                                                                                                          141 CONCLUTION

                                                                                                                                          Hence by implementing FPGA taking a processor-generated picture

                                                                                                                                          (pixilated) from memory space the Controller provides digital RGB values for

                                                                                                                                          each pixel as well as horizontal and vertical synchronization signals in order

                                                                                                                                          to correctly display the picture on a connected monitor

                                                                                                                                          70
